Description

The present invention relates to Kex2 derivatives with Kex2 protease activity which are secreted in large amounts into culture medium, and to a method for their production. The invention also relates to a method of using the aforementioned secretory Kex2 derivatives.

Many attempts have been made at methods for producing physiologically active peptides by chimeric protein expression, and chemical or enzymatic cleavage methods have been used for release of the desired proteins. Chemical methods include cleavage at asparagine residues with nitrous acid and cleavage at methionine residues with CNBr (Itakura et al., Science 198, 1059, 1977). However, these methods necessarily involve modification of the protein of interest, and raise problems of purification cost.

Enzymatic methods employ lysyl endopeptidase (Achromobacter protease I), which specifically cleaves the peptide bond on the C-terminal side of lysine, and Staphylococcal protease V8, which specifically cleaves the peptide bond on the C-terminal side of glutamic acid (Japanese Examined Patent Publication No. 6-87788). However, since these chemical methods and endoproteases recognize a single amino acid residue, it is a precondition that the recognized amino acid residue not be present in the desired peptide in order to allow efficient excision of the desired peptide from the chimeric protein, and thus the peptides which can be produced are limited. Efforts have therefore been directed at developing a highly universal cleavage method which recognizes multiple amino acid residues.

Prohormone converting enzymes are enzymes which produce peptide hormones from their precursors in vivo, and they are expected to have favorable qualities as enzymes for excision of peptide hormones from proteins, even in vitro. Kex2 protease is a prohormone converting enzyme derived from Saccharomyces cerevisiae, and it is a calcium-dependent serine protease which specifically cleaves peptide bonds at the C-terminal ends of Lys-Arg, Arg-Arg and Pro-Arg sequences. Kex2 protease is a protein composed of 814 amino acid residues with a signal sequence at the N-terminus and a transmembrane region at the C-terminus consisting of a continuous string of hydrophobic amino acids, and it is localized in the trans Golgi in cells.
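The dipeptide specificity described above can be illustrated with a minimal sketch. The function name and the example sequence are hypothetical; the example joins the parathyroid hormone prosequence Lys-Ser-Val-Lys-Lys-Arg to four further residues chosen only for illustration. A match found this way is only a prediction from the synthetic-substrate specificity and need not be cleaved in a folded protein.

```python
# Hypothetical helper: list positions after which Kex2 protease would be
# predicted to cleave, i.e. at the C-terminal end of Lys-Arg (KR),
# Arg-Arg (RR) or Pro-Arg (PR), using one-letter amino acid codes.
RECOGNITION_DIPEPTIDES = ("KR", "RR", "PR")


def predicted_cleavage_sites(sequence):
    """Return 1-based positions of the Arg residues after which cleavage is predicted."""
    seq = sequence.upper()
    sites = []
    for i in range(len(seq) - 1):
        if seq[i:i + 2] in RECOGNITION_DIPEPTIDES:
            # i is the 0-based index of the first residue of the dipeptide,
            # so the Arg sits at 1-based position i + 2.
            sites.append(i + 2)
    return sites


# Illustrative sequence only: prosequence KSVKKR followed by SVSE.
example = "KSVKKRSVSE"
print(predicted_cleavage_sites(example))
```

A purely sequence-based scan of this kind cannot tell which predicted sites are actually used on a protein substrate, which has to be determined experimentally.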

A nucleotide sequence coding for Kex2 protease and the corresponding amino acid sequence are shown in the Sequence Listing as SEQ ID NO.1. Genetic expression of a Kex2 derivative lacking the C-terminal region in Saccharomyces cerevisiae and subsequent analysis thereof revealed that the Kex2 derivative with the amino acid sequence from amino acids 1 to 614 of SEQ ID NO.1 retains the Kex2 protease activity, and is secreted into the culture medium (Fuller et al., Proc. Natl. Acad. Sci. USA, 86, 1434-1438, 1989; Japanese Unexamined Patent Publication No. 1-199578). In the present specification, a Kex2 protease derivative is designated by the number of amino acids counting from amino acid 1 of SEQ ID NO.1. For example, the Kex2 derivative with the amino acid sequence from amino acids 1 to 614 of SEQ ID NO.1 is represented as Kex2-614.
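The naming convention can be sketched as follows. The function name is hypothetical, and the sequence used is a placeholder of 814 residues, not the actual sequence of SEQ ID NO.1.

```python
# Hypothetical helper illustrating the "Kex2-N" naming convention:
# a derivative consists of amino acids 1 to N of the full-length protein.
def kex2_derivative(full_length_sequence, c_terminal_position):
    """Return (name, truncated sequence) for a C-terminally truncated derivative."""
    if not 1 <= c_terminal_position <= len(full_length_sequence):
        raise ValueError("C-terminal position lies outside the full-length sequence")
    name = f"Kex2-{c_terminal_position}"
    return name, full_length_sequence[:c_terminal_position]


# Placeholder standing in for the 814-residue sequence (NOT the real sequence).
placeholder = "M" + "A" * 813

name, truncated = kex2_derivative(placeholder, 614)
print(name, len(truncated))
```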

Heretofore known Kex2 derivatives whose secretory production methods have been studied include ss-Kex2 and 
Kex2Ap. 

ss-Kex2 is a Kex2 derivative which has a 3 amino acid residue peptide added to Kex2-614, and its production in Saccharomyces cerevisiae has been studied (Brenner et al., Proc. Natl. Acad. Sci. USA, 89, 922-926, 1992). It was expressed in a protease-deficient mutant (pep4) as a host (at 4 mg per 1 L of culture medium), and was purified from the culture supernatant at a purification yield of 20%. The reduction in molecular weight of the purified ss-Kex2 upon treatment with the Asn-type sugar chain hydrolyzing enzyme EndoH suggests that it includes Asn-type sugar chains. The pH dependency and substrate specificity of the enzyme activity have also been studied using synthetic substrates.

Kex2Ap is a Kex2 derivative represented in this specification by Kex2-666, and studies of its production in the insect cell host Sf9 have shown that 90% of its activity is secreted into the culture supernatant, and that the molecular weight of the secreted Kex2Ap is 70 kDa, which is smaller than the intracellular molecular weight of 120 kDa (Germain et al., Eur. J. Biochem. 204, 121-126, 1992). In addition, since the 70 kDa protein is found in the culture supernatant in which Kex2 is expressed, and replacement of the serine residue at position 385 of Kex2Ap (the catalytic site for Kex2 protease activity) by an alanine residue results in Kex2Ap in the culture supernatant with a molecular weight of 120 kDa, equal to the intracellular molecular weight, the 70 kDa protein is believed to be an autolysate of the C-terminal portion-deficient Kex2Ap (120 kDa) in the culture medium.

Attempts have also been made at expression of the derivative Kex2A504, in which the Lys-Arg sequence (amino acids 503-504 of SEQ ID NO.1), the cleavage site expected from the molecular weight of the decomposition product and the substrate specificity of Kex2 protease, is replaced with a Lys-Leu sequence. However, in this case as well a 70 kDa protein is found in the culture medium. Since the Lys-Arg sequence is not always cleaved by Kex2Ap during autolysis, and no other sequence exists as a recognition site of Kex2 protease, this suggests the possibility that Kex2A504 recognizes a completely different sequence from the one predicted from synthetic substrates, and cleaves itself.

Thus, despite research on the substrate specificity of Kex2 derivatives using synthetic substrates, the substrate specificity when using proteins is not yet understood. Also, little is known about the secreted amounts of the different Kex2 derivatives, and it is still not known whether stable secretory production of Kex2 derivatives other than Kex2-614 is possible.

The general aim herein is to provide new and useful techniques for supplying a large amount of enzymes with 
Kex2 protease activity, with demonstration of uses of such enzymes. 

Specifically, the use of enzymes with Kex2 protease activity for the production of useful peptides by chimeric protein expression on an industrial scale particularly calls for one or more of the following three issues to be addressed.

A first issue is to increase the amount of production of the Kex2 derivatives. The enzyme having Kex2 protease activity with the greatest yield hitherto reported has been ss-Kex2, and that yield is about 4 mg per 1 L of culture medium. However, this yield is low in terms of production of enzyme for excision of desired peptides from chimeric proteins on an industrial scale. Also, since secretory Kex2 derivatives such as Kex2Ap are believed to possibly undergo autolysis in culture medium and the cleavage site cannot be predicted, it is unknown how to design derivatives to increase the yield.

Consequently, it would be advantageous to find Kex2 derivatives which do not undergo autolysis and to construct a high-expression system for those Kex2 derivatives. In the present specification, autolysis refers to decomposition which brings a reduction in Kex2 protease activity, and does not refer to maturation of Kex2 protease which accompanies the autocleavage of Lys-Arg (amino acids 108-109 of SEQ ID NO.1) (Brenner & Fuller, Proc. Natl. Acad. Sci. USA, 89, 922-926, 1992).

A second issue is establishment of a purification process for high purity Kex2 derivatives without contamination by other proteases. The activities of the Kex2 derivatives hitherto reported have been evaluated using only synthetic substrates and not protein substrates, and thus the presence of contamination by other proteases cannot be determined. In particular, it is difficult to achieve precise control of reaction conditions when excising peptides from chimeric proteins on an industrial scale, and since contamination by other proteases can notably lower the recovery rates of the object peptides, the enzymes with Kex2 protease activity should desirably be of a high degree of purity.

A third issue is setting the conditions for cleaving chimeric proteins by enzymes with Kex2 protease activity. It is well known to those in the art that the tertiary structure of proteins affects enzyme activity, the enzyme stability under the reaction conditions and recognition of the substrate. However, almost no previous reports have dealt with these points. In particular, since chimeric proteins often form insoluble inclusion bodies in chimeric protein expression methods, denaturing agents such as urea are used for their solubilization. However, it is generally unknown what enzyme structure can retain enzyme activity in the presence of urea. Consequently, it is unclear whether or not Kex2 protease and secretory Kex2 derivatives can be used as enzymes for excision of desired peptides from proteins.

Mass production of other prohormone converting enzymes has also been unsuccessful, and it is also unknown whether these enzymes can be used as enzymes for excision of desired peptides from chimeric proteins in vitro. Consequently, in order to use prohormone converting enzymes, including Kex2 protease, as enzymes for releasing desired peptides from chimeric proteins, it is important to establish more efficient expression and purification methods, and set the cleavage conditions for using proteins as substrates in vitro.

As a result of research carried out with these issues in mind, the present inventors have found that Kex2 derivatives having amino acid sequences from position 1 at the N-terminus to an amino acid at a position between 618 and 698 have notably higher secretory production without undergoing autolysis in culture, and that the production may be further increased by using methylotrophic yeast as the host cells. This can be used for mass supply of Kex2 derivatives. In addition, the inventors purified the secretory Kex2 derivatives from culture supernatant concentrates to single bands in SDS-PAGE by the 2 steps of anion exchange chromatography and hydrophobic chromatography, and have confirmed that, under conditions in which desired peptides are excised from chimeric proteins, the purified Kex2 derivatives contain no other protease activity which decomposes the peptides and lowers the recovery rate.

Furthermore, the inventors found that under conditions in which desired peptides are excised from chimeric proteins, the substrate specificity of secretory Kex2 derivatives is altered by changing the urea concentration, and demonstrated that a desired peptide can be excised from a chimeric protein at an efficiency of 75% even when the desired peptide includes 2 recognition sites for Kex2 protease. It was also demonstrated that Kex2-660 can be used to excise hPTH(1-34) from the chimeric protein βGal-117S4HPPH34 on a semi-large scale, i.e. that for the secretory Kex2 derivative, the yield, purity and excision efficiency of the desired peptide from the chimeric protein can be suitable for production on an industrial scale.

In one aspect herein we provide proteins with Kex2 protease activity which are obtained by transforming host cells with an expression vector comprising DNA coding for a "natural" amino acid sequence whose N-terminus is the Met at amino acid 1 and whose C-terminus is one of the amino acids between amino acids 618 and 698 of the amino acid sequence of the Kex2 protease represented by SEQ ID NO.1, or an amino acid sequence which is this natural amino acid sequence modified by a substitution, deletion or addition of one or more amino acids, and then culturing the resulting transformants and recovering the protein from the culture. In the specification, such proteins are collectively referred to as "enzymes with Kex2 protease activity", "Kex2 protease derivatives", "secretory Kex2 derivatives", etc.

Other aspects of the invention provide genes, particularly DNA, coding for the aforementioned proteins, vectors, particularly expression vectors, comprising the aforementioned DNA, and transformants, preferably animal cells or yeast, obtained by transforming host cells with the aforementioned vector.

Another aspect is a method for producing the aforementioned proteins, comprising the steps of culturing a host which has been transformed with the aforementioned expression vector and recovering the aforementioned protein from the culture. The protein is preferably recovered from the culture supernatant by anion exchange chromatography and hydrophobic chromatography.

The present invention still further provides a method for excision of desired peptides from chimeric proteins using the aforementioned proteins. A chimeric protein is a protein obtained by adding a protective peptide to a desired peptide, and the desired peptide may be excised by the aforementioned protein so long as the link between the desired peptide and the protective peptide is an amino acid sequence recognized by the aforementioned protein. Also, even if the junction between a desired peptide and a protective peptide is not an amino acid sequence recognized by the aforementioned protein, a recognition site of the aforementioned protein may be inserted between the desired peptide and the protective peptide to allow the desired peptide to be excised using the aforementioned protein.

BRIEF DESCRIPTION OF THE DRAWINGS 


Fig. 1 shows the sequences of the synthetic oligomers used for construction of a synthetic hProPTH (1-84) gene.
Fig. 2 shows a process for constructing the synthetic hProPTH (1-84) gene. 

Fig. 3 shows a process for constructing plasmid pG210S(S/X). Plac represents the E. coli lactose operon promoter and Ttrp represents the E. coli TrpE attenuator terminator.
Fig. 4 shows a process for constructing plasmid pGP#19 which expresses the chimeric protein βGal-139S(FM)PPH84.

Fig. 5 shows a process for constructing plasmid pPTH(1-34)proa. 

Fig. 6 shows a process for constructing plasmid ptacCATPTH(1-34) which expresses the chimeric protein CATPH34. Ptac represents a synthetic promoter of the -35 region of trp promoter and the -10 region of Plac.
Fig. 7 is a photograph of SDS-PAGE for a sample of the chimeric protein CATPH34 expressed by E. coli, before and after purification.

Fig. 8 shows a process for constructing plasmid pGP#19PPH34 which expresses the chimeric protein βGal-139SPPH34.
Fig. 9 shows the former steps in a process for constructing plasmid pG117S4HPPH34 which expresses the chimeric protein βGal-117S4HPPH34.

Fig. 10 shows the latter steps in the process for constructing plasmid pG117S4HPPH34 which expresses the chimeric protein βGal-117S4HPPH34.

Fig. 11 shows the structure of the KEX2 gene and the sequences of the primers synthesized for construction of 
the secretory Kex2 derivative genes, and their respective annealing sites. 
Fig. 12 shows a process for constructing plasmid pYE-660 which expresses a secretory Kex2 derivative. PKEX2 represents a promoter for the KEX2 gene of Saccharomyces cerevisiae.

Fig. 13 shows a process for constructing plasmid pYE-614 which expresses Kex2-614. 

Fig. 14 is a graph comparing Kex2 activity per OD660 of secretory Kex2 derivatives using a synthetic substrate. The relative activities of the cultures of secretory Kex2 derivative producing strains are given taking the activity of K16-57C[pYE-614] as 1.

Fig. 15 is a photograph of an electrophoresis gel comparing the yields of secretory Kex2 derivatives per 200 µl of culture supernatant.

Fig. 16 is a graph showing the activities of Kex2-660 at different urea concentrations, using the synthetic substrate Boc-Leu-Arg-Arg-MCA. The relative activities at each urea concentration are given taking the Kex2-660 activity in the absence of urea as 100%.

Fig. 17 is a graph comparing the recovery rates of βGal(1-14), hPTH(1-84), hPTH(1-44) and [hPTH(1-84) + hPTH(1-44)] from βGal-139S(FM)PPH84, at different urea concentrations (1.5-3.0 M).
Fig. 18 is a graph comparing the recovery rates of βGal(1-14), hPTH(1-84), hPTH(1-44) and [hPTH(1-84) + hPTH(1-44)] from βGal-139S(FM)PPH84, at different urea concentrations (3.0-4.0 M).
Fig. 19 shows an elution profile of HPLC for before and after Kex2-660 processing of the chimeric protein βGal-139S(FM)PPH84 and a schematic representation of the relationship between identified peptide fragments and βGal-139S(FM)PPH84. The peak numbers in the profile correspond to the numbers of the fragments. Fragments 1, 2, 3 and 4 were identified by determining the amino acid sequences. Fragment 7 was estimated based on elution time, and fragment 5 was estimated by correlation between βGal(1-14) and hPTH(1-84). Fragment 6 was so designated for fragments which may be eluted.

Fig. 20 is a graph comparing the recovery rates of hPTH(1-84), hPTH(1-44) and hPTH(45-84) from βGal-139S(FM)PPH84, at different enzyme concentrations. The solid squares, open circles, solid circles and solid triangles represent, respectively, recovery rates for βGal-139S(FM)PPH84, hPTH(1-84), hPTH(1-44) and hPTH(45-84). The recovery rates were calculated in the following manner. For hPTH(1-84) and βGal-139S(FM)PPH84, the peak area ratio against a known concentration of a corresponding standard substance was used, and for hPTH(1-44) and hPTH(45-84) the peak area ratio against a known concentration of hPTH(1-84) was used, and compensation was made based on the number of amino acid residues of the corresponding peptides.
Fig. 21 shows an elution profile of HPLC for before and after Kex2-660 processing of the chimeric protein CATPH34.

Fig. 22 shows a process for constructing plasmid pCU660 which expresses Kex2-660.

Fig. 23 is a photograph of SDS-PAGE which shows the secretion of Kex2-660 in culture supernatants for different 
culturing times of TK62/pCU660#10. 

Fig. 24 shows a process for constructing plasmid pG210ShCT[G]. 
Fig. 25 is a graph comparing the yield of each secretory Kex2 derivative per OD660 of culture based on Kex2 activities using a synthetic substrate. The yields of each of the secretory Kex2 derivatives are given with the yield of K16-57C[pYE22-614] as 1.

Fig. 26 is a photograph of SDS-PAGE comparing the yields of secretory Kex2 derivatives per 200 µl of culture supernatant. Lanes 1 and 12 are developed from molecular weight markers, and lanes 2 through 11 are from concentrates of culture supernatants of K16-57C[pYE-22m], K16-57C[pYE22-614], K16-57C[pYE22-630], K16-57C[pYE22-640], K16-57C[pYE22-650], K16-57C[pYE22-660], K16-57C[pYE22-679], K16-57C[pYE22-682], K16-57C[pYE22-688] and K16-57C[pYE22-699]. In this representation of the electropherogram, the numbers to the left of lane 1 indicate the size (kDa) of the molecular weight markers.

Fig. 27 shows a process for the construction of Kex2-660 expression plasmid pHIL-660 for a host Pichia pastoris. 


DETAILED DESCRIPTION 

As will be explained hereunder, the proteins according to the present invention differ considerably in terms of production and secretion efficiency depending on the length of the protein and particularly the position of the C-terminus. We find that proteins of the following lengths give high production and secretion efficiency: Kex2 derivatives having amino acid sequences from Met at amino acid 1 to any of the amino acids at positions 618 to 698 of the amino acid sequence represented by SEQ ID NO.1. The C-terminus of a Kex2 protease derivative of the present invention is preferably any one of the amino acids from position 630 to position 688 of the amino acid sequence of SEQ ID NO.1, more preferably any one of the amino acids from position 630 to position 682 of the amino acid sequence of SEQ ID NO.1, and still more preferably any one of the amino acids from position 630 to position 679. The above-mentioned amino acid sequences composed of portions of the amino acid sequence of SEQ ID NO.1 are sometimes referred to as natural amino acid sequences for the purpose of the present invention.

However, it is well known among those in the art that enzyme activity can be maintained even with substitution of multiple amino acids by other amino acids, deletion of amino acids or addition of amino acids in regions of the amino acid sequence of an enzyme protein other than those regions participating in its activity. Therefore, the present invention encompasses, in addition to Kex2 protease derivatives having the aforementioned natural amino acid sequences, proteins with Kex2 protease activity having amino acid sequences which are the aforementioned natural amino acid sequences modified by a substitution, deletion or addition of one or more amino acids, e.g. modifications between said terminus portions, and e.g. usually modifications entailing the substitution, deletion or addition of from 1 to 5 or 10 or 15 or 20 or 30 amino acids, always subject to activity being present.

The present invention still further provides genes, particularly DNA, coding for the aforementioned various polypeptides. The DNA may be prepared according to a conventional method, for example from full-length DNA having the nucleotide sequence represented by SEQ ID NO.1 or another nucleotide sequence coding for the same amino acid sequence, or by cleaving a DNA containing the object DNA and linking the cleavage product to an oligonucleotide if desired, or by introducing a translation termination codon at a suitable location in the DNA. Alternatively, DNA coding for one of the aforementioned modified amino acid sequences may be prepared by a conventional method such as site-directed mutagenesis or the polymerase chain reaction (PCR), using the natural full-length DNA having the nucleotide sequence represented by SEQ ID NO.1 or a fragment thereof as a template, and using a primer oligonucleotide containing a desired mutation as a mutagenic primer.

The expression vector according to the invention contains an expression regulating region such as a promoter which is functional in the host used. For example, when yeast cells are used as the host, glyceraldehyde-3-phosphate dehydrogenase promoter, glycerophosphate kinase promoter, acid phosphatase promoter, alcohol oxidase promoter, formate dehydrogenase promoter, methanol oxidase promoter or the like may be used.

The host cells used according to the invention may be yeast cells. The yeast cells are preferably from Saccharomyces, Pichia, Hansenula or Candida, which include Saccharomyces cerevisiae, Pichia pastoris, Hansenula polymorpha and Candida boidinii. Especially preferred yeasts are methylotrophic yeasts of the genera Candida and Pichia, such as Candida boidinii and Pichia pastoris.

In order to render enzymes with Kex2 protease activity usable as enzymes for excision of peptides from chimeric proteins on an industrial scale, the present inventors determined that it was necessary to solve the problems of 1) providing high yields for large supply, 2) achieving high purity, without contamination by other proteases which cleave the desired peptide, and 3) establishing conditions for cleavage of substrate protein, and to this end the inventors carried out the following investigation.

First, we studied the conditions for increasing the yield of enzymes with Kex2 protease activity to deal with the first object. For the purification on an industrial scale of a large amount of enzymes with Kex2 protease activity, not only must the yield be high, but the purification thereof must also be simple, and for this purpose the inventors considered it advantageous to secrete the Kex2 derivatives into culture medium containing few other proteins. The secretory Kex2 derivative ss-Kex2 has been reported as mentioned above, but its yield is 4 mg per 1 L of culture medium, which is too low for use on an industrial scale. Thus, genes coding for different secretory Kex2 derivatives were prepared and expressed in Saccharomyces cerevisiae hosts to examine the secretion yields.

The Kex2 derivatives used for the invention are Kex2-614, Kex2-630, Kex2-640, Kex2-650, Kex2-660, Kex2-679, Kex2-682, Kex2-688 and Kex2-699. For the expression of Kex2 derivative genes, the Saccharomyces cerevisiae glyceraldehyde-3-phosphate dehydrogenase (GAP) gene promoter was used. Plasmids containing these expression units were introduced into Saccharomyces cerevisiae, which was then cultured overnight at 30°C, and the Kex2 protease activities in the cultures were measured using the synthetic substrate Boc-Leu-Arg-Arg-MCA.

As a result, Kex2 protease activity was detected in the culture supernatants of the yeast into which expression units of genes for Kex2-614, Kex2-630, Kex2-640, Kex2-650, Kex2-660, Kex2-679, Kex2-682 or Kex2-688 had been introduced, but no activity was detected in the culture supernatant of the yeast into which the expression unit of the Kex2-699 gene had been introduced. This demonstrated that secretion of Kex2 derivatives into culture supernatants can be achieved using derivative genes including amino acid residues from position 1 to position m (m = 614 to 688) from the N-terminus.

Furthermore, the secretion yields per OD660 were found in Example 1 to be significantly higher for Kex2-660 and Kex2-679 than for the hitherto reported Kex2-614. Also, SDS-PAGE analysis of samples prepared by concentrating the culture supernatants 20-fold with an ultrafiltration membrane (molecular weight cutoff 10,000) showed that not only the Kex2 activity but also the amounts of secretion of Kex2-660 and Kex2-679 were greater than for Kex2-614. It was demonstrated that the molecular weights increased with greater numbers of amino acid residues, i.e. no autolysates accumulated in this culturing, unlike what occurred with Kex2Ap in the insect cell host Sf9.

Furthermore, in Example 9 it was shown that the Kex2 activities per OD660 of cultures of Kex2-630, Kex2-640, Kex2-650, Kex2-660 and Kex2-679 were at least 10 times higher than that of the hitherto reported Kex2-614, and the Kex2 activities of Kex2-682 and Kex2-688 were 6 times and 3.4 times greater, respectively, than that of Kex2-614, while the Kex2 activity of Kex2-699 was undetectable.

In other words, it was shown that the Kex2 activity of the culture increases when the C-terminus of the expressed Kex2 derivative lies between the 630th and 679th amino acid residues of Kex2, while the activity decreases as the C-terminal region extends beyond that length.

Also, the results of SDS-PAGE analysis confirmed that the amount of Kex2 secretion had increased. It was also 
demonstrated that the molecular weights increase with more amino acid residues of the Kex2 derivatives, and thus no 
autolysates accumulated in this culturing as occurred with Kex2Ap production in the insect cell host Sf9. 

Since the secretory Kex2 derivatives prepared here were found to accumulate in the culture supernatants without undergoing autolysis, a Kex2-660 production test was conducted after changing the expression system from the Saccharomyces cerevisiae system to, for example, an expression system with the methylotrophic yeast Candida boidinii as the host, which has a high production yield per culture. As a result, it was possible to increase the yield to 340 mg per 1 L of culture supernatant. This is an amount capable of releasing about 200 g of the physiologically active peptide hPTH(1-34) from a chimeric protein, and thus it was demonstrated that the present invention is able to supply an amount of enzyme necessary for excision of useful peptides from chimeric proteins on an industrial scale. In addition, it was found that yeast of the genus Candida is especially preferable as host.
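The scale of this improvement can be made explicit with a short calculation using only the figures quoted above. The variable names are illustrative, and the comparison is between different derivatives in different hosts (ss-Kex2 in Saccharomyces cerevisiae versus Kex2-660 in Candida boidinii), so the ratio is indicative only.

```python
# Figures quoted in the text; variable names are illustrative.
SS_KEX2_YIELD_MG_PER_L = 4.0       # ss-Kex2 yield per litre of culture medium
KEX2_660_YIELD_MG_PER_L = 340.0    # Kex2-660 yield per litre of culture supernatant
HPTH_RELEASED_G = 200.0            # approx. hPTH(1-34) releasable by that amount

# Fold increase in enzyme yield per litre.
fold_increase = KEX2_660_YIELD_MG_PER_L / SS_KEX2_YIELD_MG_PER_L

# Approximate grams of hPTH(1-34) released per milligram of enzyme.
peptide_g_per_mg_enzyme = HPTH_RELEASED_G / KEX2_660_YIELD_MG_PER_L

print(f"fold increase in yield: {fold_increase:.0f}")
print(f"hPTH(1-34) per mg enzyme: about {peptide_g_per_mg_enzyme:.2f} g")
```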

The second problem to be dealt with was the purity of the secretory Kex2 derivatives. First, Kex2-660, which had the largest secretion yield, was purified from the culture supernatant. Kex2-660 secreted extracellularly by Saccharomyces cerevisiae was purified to a single band (57% yield) by concentrating the culture supernatant with ultrafiltration (molecular weight 30,000) in the presence of 0.2 mM calcium, and subjecting it to anion exchange chromatography and hydrophobic chromatography. This recovery rate was the highest yet reported, thus demonstrating that this method allows high purity Kex2 derivatives to be supplied in large amounts.

Also, in order to determine the substrate specificity of the Kex2 derivatives using protein substrates, as well as the possibility of contamination with other proteases, an excess of purified Kex2-660 was allowed to act on the chimeric protein βGal-139S(FM)PPH84, and the structures of the resulting peptides were determined. βGal-139S(FM)PPH84 is a chimeric protein prepared by linking hPTH(1-84) via a Phe-Met sequence and a human parathyroid hormone-derived prosequence (Lys-Ser-Val-Lys-Lys-Arg) to βGal-139S, which is a polypeptide from the N-terminus to the 139th amino acid residue of E. coli β-galactosidase in which the cysteine residues at positions 76 and 122 have been substituted with serine residues. The amino acid sequence of βGal-139S is represented as SEQ ID NO.2, and the amino acid sequence of hPTH(1-84) is represented as SEQ ID NO.3.

As a result, it was found that the N-terminal sequences of the resulting peptides corresponded to the peptide fragments expected from the substrate specificity of Kex2 protease, and that the purified Kex2-660 had no contamination by other proteases which might interfere with excision of the desired peptide from the chimeric protein.

The purified Kex2-660 was used to investigate the cleavage conditions when using a protein as the substrate, in 
order to deal with the third object. 

First, we studied the effect of urea, which is commonly used for releasing desired peptides from chimeric proteins. Kex2-660 was allowed to act on the synthetic substrate Boc-Leu-Arg-Arg-MCA in the presence of 0 to 4.0 M urea, and it was found that at concentrations of 1.0 M, 2.0 M and 4.0 M the activities were reduced to 70%, 40% and 10%, respectively, compared to the absence of urea. When insoluble inclusion bodies are dissolved in a urea solution and protease acts thereon, the concentration of the urea solution is generally 2.0 to 4.0 M. Thus, it was concluded that Kex2-660 can be used for excision of desired peptides from chimeric proteins, if the dissolution conditions of the chimeric proteins are appropriately determined.
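As a rough illustration of how the reported activities might be used when choosing a urea concentration, the sketch below interpolates linearly between the four values quoted above. Linear interpolation is an assumption made here for illustration only; the measured curve is the one shown in Fig. 16, and the function name is hypothetical.

```python
# Relative Kex2-660 activity (%) on Boc-Leu-Arg-Arg-MCA at the urea
# concentrations (M) quoted in the text; 0 M is the 100% reference.
UREA_ACTIVITY = [(0.0, 100.0), (1.0, 70.0), (2.0, 40.0), (4.0, 10.0)]


def relative_activity(urea_molar):
    """Estimate relative activity (%) by linear interpolation (an assumption)."""
    lowest = UREA_ACTIVITY[0][0]
    highest = UREA_ACTIVITY[-1][0]
    if not lowest <= urea_molar <= highest:
        raise ValueError("urea concentration outside the reported 0 to 4.0 M range")
    # Walk over consecutive pairs of data points and interpolate within
    # the interval that contains the requested concentration.
    for (x0, y0), (x1, y1) in zip(UREA_ACTIVITY, UREA_ACTIVITY[1:]):
        if x0 <= urea_molar <= x1:
            return y0 + (y1 - y0) * (urea_molar - x0) / (x1 - x0)


for molar in (1.5, 2.0, 3.0):
    print(f"{molar} M urea: about {relative_activity(molar):.0f}% activity")
```

These estimates apply to the synthetic substrate only; as described for the protein substrate, urea can also change which sites are cleaved and how efficiently.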

Next, the effect of 1.5 to 3.0 M urea on the action of Kex2-660 on the chimeric protein βGal-139S(FM)PPH84 was investigated. The sequences predicted to be cleaved by Kex2 protease are at the 4 sites of Arg-Arg (amino acids 13-14 of SEQ ID NO.2, hereunder referred to as cleavage site A), Lys-Arg (prosequence portion, hereunder referred to as cleavage site B) and Pro-Arg (amino acids 43-44 and 51-52 of SEQ ID NO.3, hereunder referred to as cleavage sites C and D, respectively). The C-terminal ends of each of the sites are cleaved by the protease.

As a result of investigating the structures and amounts of the peptide fragments produced by the Kex2 protease processing, it was found that higher urea concentrations in the range of 1.5 to 2.5 M gave higher cleavage efficiency by Kex2-660, while there was no difference in cleavage efficiency between urea concentrations of 2.5 M and 3.0 M. In addition, with regard to the effect of urea on substrate specificity, it was found that the cleavage efficiency at cleavage site B improved as the urea concentration increased, but the cleavage efficiency at cleavage site C reached a peak at 2.5 M urea concentration, and thus higher urea concentrations gave better yields of hPTH(1-84) from the chimeric proteins. Also, no cleavage was found at cleavage site D. The same tendency was observed even at urea concentrations of 3.0 to 4.0 M. These discoveries were unpredictable from using synthetic substrates, and were first arrived at by the present invention.

The inventors then studied the conditions for excision of hPTH(1-84) from βGal-139S(FM)PPH84 using Kex2-660.
Kex2-660 was used at different proportions (25 kU, 50 kU, 100 kU, 150 kU and 200 kU of Kex2-660 to 1 mg of chimeric
protein) under conditions at which hPTH(1-84) is excised from chimeric proteins, and the structures of the resulting
peptide fragments and their yields were examined. When 50 kU/ml of Kex2-660 was used, hPTH(1-84) was excised
at an efficiency of about 75%. Here, about 10% of the βGal-139S(FM)PPH84 remained.

βGal-139S(FM)PPH84 decreased with increasing amounts of Kex2-660, and almost completely disappeared at
200 kU/ml. However, it was also found that the proportions of hPTH(1-44) and hPTH(45-84) increased simultaneously,
while the excision efficiency of hPTH(1-84) underwent no increase (Fig. 20). On the other hand, no decrease in the
amount of hPTH(1-84) was seen beyond that accounted for by the increase in hPTH(1-44) even with increasing amounts
of Kex2-660, and thus it was confirmed that the Kex2-660 purified in Example 2 described hereunder had no
contamination by other proteases with substrate specificities different from that of Kex2 protease under conditions
at which hPTH(1-84) is excised from chimeric proteins.

Furthermore, it was demonstrated that selection of the reaction conditions allows desired peptides to be efficiently
excised (with an excision efficiency of 75%) from chimeric proteins even when the desired peptide includes a cleavage
site for the Kex2 protease. This excision efficiency is higher than the excision efficiency of 50% for hPTH(1-84)
using factor-Xa (Gardella et al., J. Biol. Chem. 265, 26, 15854-15859, 1990).

Gardella et al. suggested the possibility that contaminating proteases or factor-Xa itself degrades hPTH(1-84),
judging from lower hPTH(1-84) recovery rates when the enzyme amount is increased or the reaction time is extended,
despite the fact that hPTH(1-84) does not include the factor-Xa recognition site, i.e. the Ile-Glu-Gly-Arg sequence. The
fact that hPTH(1-84) is obtained at a high recovery rate even though it includes 2 cleavage sequences for Kex2
protease suggests that the purified Kex2 derivatives with increased yields according to the invention are useful as
enzymes for excision of desired peptides from chimeric proteins.

Furthermore, it was found that the purified Kex2-660 can excise hPTH(1-34) from the soluble chimeric protein
CATPH34 in the absence of urea and from the insoluble chimeric protein βGal-117S4HPPH34 in the presence of urea,
and thus it functions even when the substrates are chimeric proteins with different protective peptides and cleavage
site regions, showing that it has wide industrial application. Also, no protease contamination was detected even in the
absence of urea.

In other words, it was found that secretory Kex2 derivatives with increased yields which are purified to a single 
band degrade desired peptides under conditions in which the desired peptides are released from chimeric proteins, 
irrespective of the presence or absence of urea, and have no contamination of other proteases which lower the recovery
rates, that selection of the conditions allows these Kex2 derivatives to recover the desired peptides very efficiently
even when the desired peptides include recognition sites for the Kex2 proteases, and that the amounts of expression
per 1 L of culture medium are sufficient for release of about 200 g of the desired peptides and thus the secretory Kex2
derivatives obtained according to the invention are supplied in amounts necessary for excision of desired peptides
from chimeric proteins on an industrial scale.

EXAMPLES 

The present invention will now be explained in more detail by way of the following Reference Examples and Examples
which, however, are not intended to restrict the invention. The plasmids, E. coli and yeast used as materials
for the invention and the basic experimental procedures employed for all of the examples will be described first, and
then the Reference Examples and Examples will be presented.

Plasmids 

Plasmid pG97S4DhCT[G] is a plasmid which is capable of expressing a chimeric protein wherein hCT[G] (a peptide
resulting from addition of a glycine residue to the C-terminus of the 32nd amino acid of human calcitonin) has been
linked to a peptide comprising the region from the N-terminus to the 97th amino acid of β-galactosidase (where the
76th cysteine residue is replaced by a serine residue and the 40th, 41st, 71st and 75th glutamic acid residues are
replaced by aspartic acid residues: named βGal-97S4D) via a glutamic acid residue, under the E. coli lactose operon
promoter.

By introducing a DNA region coding for the desired peptide in reading frame as an EcoRI-SalI DNA fragment, it is
possible to express a chimeric protein with βGal-97S4D. The E. coli strain W3110 containing this plasmid was named
Escherichia coli SBM323, and was deposited at the National Institute of Bioscience and Human Technology on August
8, 1991 as FERM BP-3503.

Plasmid ptacCAT is a plasmid which is capable of expressing the chloramphenicol acetyltransferase gene under
the synthetic promoter tac. The E. coli strain JM109 containing this plasmid was named Escherichia coli SBM336, and
was deposited at the National Institute of Bioscience and Human Technology on March 1, 1996 as FERM BP-5436.
pG97S4DhCT[G] and ptacCAT were used as materials to construct the soluble hPTH(1-34) chimeric protein-expressing
vector ptacCATPTH(1-34) (Reference Example 2 and Figs. 5 and 6).

Plasmid pG210ShCT[G] is a plasmid in which the gene coding for βGal-97S4D from pG97S4DhCT[G] is replaced
with the gene coding for βGal-210S (a peptide consisting of the N-terminus to the 210th amino acid of β-galactosidase,
wherein the 76th, 122nd and 154th cysteine residues are replaced with serine residues).

Plasmid pG210ShCT[G] can be obtained by linking a DNA fragment containing the gene coding for βGal-210S,
obtained by digesting pGHα210(Ser)rop with restriction enzymes PvuII and EcoRI, and a DNA fragment containing a
vector portion obtained by digesting pG97S4DhCT[G] with restriction enzymes PvuII and EcoRI (Fig. 24). A method
for constructing pGHα210(Ser)rop is disclosed in Japanese Examined Patent Publication No. 6-87788. pG210ShCT[G]
was used as material for cloning of a synthetic human parathyroid hormone precursor (hProPTH(1-84)) gene and
construction of plasmid pGP#19 (Reference Example 1 and Fig. 5).

Plasmid pCRII was acquired from Invitrogen Co. and used for direct cloning of the PCR products.

Plasmid pYE-22m is an expression vector which utilizes the promoter and terminator for the glyceraldehyde-3-phosphate
dehydrogenase (GAP) gene and has a multicloning site (MCS: EcoRI-SalI), with the promoter at the
EcoRI end, the TRP1 gene as the selective marker, and a 2 µm DNA portion (inverted repeats) as the replication origin.
The E. coli strain JM109 containing plasmid pYE-22m was named Escherichia coli SBM335, and was deposited at the
National Institute of Bioscience and Human Technology on March 1, 1996 as FERM BP-5435 (Fig. 12).

Plasmid pYE-KEX2 (5.0)b (Mizuno et al., Biochem. Biophys. Res. Commun. 156, 246-254, 1988) was used as a
template to construct Kex2 derivative genes by PCR (Fig. 12). Plasmid pYE-KEX2 (RI-PvuII) (Japanese Unexamined
Patent Publication No. 1-199578) was used to construct the expression vector pYE-614 for a protein comprising
a peptide of 14 amino acids (SEQ ID NO.4) at the C-terminus of Kex2-614 (Fig. 13).

Plasmid pNOTelI is an expression vector which utilizes the promoter and terminator for the alcohol oxidase gene
and which includes a restriction enzyme NotI site, with the URA3 gene as the selective marker (Japanese Unexamined
Patent Publication No. 5-344895).

E. coli and yeast 

The competent cell line E. coli JM109 was acquired from Toyobo and used for plasmid preparation and chimeric
protein expression. E. coli JM101 and M25 (Sugimura et al., Biochem. Biophys. Res. Commun. 153, 753-759, 1988)
were used for production of the chimeric proteins CATPH34 and βGal-117S4HPPH34, respectively. The hosts used
for secretory expression of the Kex2 proteases were Saccharomyces cerevisiae K16-57C (MAT a leu2 trp1 ura3 kex2-8:
Mizuno et al., Biochem. Biophys. Res. Commun. 156, 246-254, 1988) and Candida boidinii TK62.

TK62 is a uracil-requiring cell line obtained by URA3 mutation from Candida boidinii S2AOU-1 (Sakai, Y. et al., J.
Bacteriol., 173, 7458-7463, 1991). This Candida boidinii S2AOU-1 strain was named Candida boidinii SAM1958, and
was deposited at the National Institute of Bioscience and Human Technology on February 25, 1992 as FERM BP-3766.

Culture media 

For culturing of E. coli, LB medium (0.5% (w/v) yeast extract, 1% (w/v) tryptone, 0.5% (w/v) NaCl), SB medium
(1.2% (w/v) yeast extract, 2.4% (w/v) tryptone, 0.5% (v/v) glycerol), SB2 medium (2% (w/v) yeast extract, 1% (w/v)
tryptone, 0.5% (v/v) glycerol, 1.2% (w/v) K2HPO4, 0.3% (w/v) KH2PO4) and NU medium (0.3% (w/v) yeast extract,
1.5% (w/v) glucose, 0.3% (w/v) KH2PO4, 0.3% (w/v) K2HPO4, 0.27% (w/v) Na2HPO4, 0.12% (w/v) (NH4)2SO4, 0.2 g/L
NH4Cl, 0.2% (w/v) MgSO4, 40 mg/L FeSO4·7H2O, 40 mg/L CaCl2·2H2O, 10 mg/L MnSO4·nH2O, 10 mg/L AlCl3·6H2O,
4 mg/L CoCl2·6H2O, 2 mg/L ZnSO4·7H2O, 2 mg/L Na2MoO4·2H2O, 1 mg/L CuCl2·7H2O, 0.5 mg/L H3BO3) were used.

For culturing of Saccharomyces cerevisiae, YCDP medium (1% (w/v) yeast extract, 2% (w/v) casamino acid, 2%
(w/v) glucose, 100 mM potassium phosphate (pH 6.0)) was used.

For culturing of Candida boidinii and Pichia pastoris, BMGY medium (1% (w/v) yeast extract, 2% (w/v) peptone,
1% (v/v) glycerol, 1.34% (w/v) YNB w/o AA: Yeast Nitrogen Base without Amino Acids, 0.4 mg/L biotin, 100 mM
potassium phosphate (pH 6.0)), BMMY medium (1% (w/v) yeast extract, 2% (w/v) peptone, 0.5% (v/v) methanol, 1.34%
(w/v) YNB w/o AA, 0.4 mg/L biotin, 100 mM potassium phosphate (pH 6.0)), YPD medium (1% (w/v) yeast extract, 2%
(w/v) peptone, 2% (w/v) glucose) and YPGM medium (1% (w/v) yeast extract, 2% (w/v) peptone, 3% (v/v) glycerol, 1%
(v/v) methanol, 1.34% (w/v) YNB w/o AA, 50 mM potassium phosphate (pH 6.0)) were used.
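
The media above are specified as percentages (w/v), i.e. grams per 100 ml. As an illustration only, the conversion to weighed amounts for a given batch volume can be sketched as follows; the dictionary holds the LB composition stated above, while the function names and the 2 L batch are hypothetical.

```python
def grams_for_wv_percent(percent_wv: float, volume_l: float) -> float:
    """Mass (g) of a solid needed for a given % (w/v) in a given volume.

    1% (w/v) is 1 g per 100 ml, i.e. 10 g per litre.
    """
    return percent_wv * 10.0 * volume_l


# LB medium as specified in the text, in % (w/v).
LB_MEDIUM = {"yeast extract": 0.5, "tryptone": 1.0, "NaCl": 0.5}


def recipe(medium: dict, volume_l: float) -> dict:
    """Return grams of each solid component for the requested volume."""
    return {name: grams_for_wv_percent(pct, volume_l) for name, pct in medium.items()}


if __name__ == "__main__":
    # Hypothetical 2 L batch of LB medium.
    for component, grams in recipe(LB_MEDIUM, 2.0).items():
        print(f"{component}: {grams:.1f} g")
```

Liquid components given as % (v/v), such as glycerol or methanol, would be measured by volume instead and are not covered by this sketch.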

Basic experimental procedure 

Unless otherwise specified, the experimental procedures in the Reference Examples and Examples were accord- 
ing to the following methods. 

The DNA primers were synthesized by the phosphoramidite method using an automatic synthesizer (Model 380A,
Applied Biosystems). The DNA nucleotide sequences were determined by the dideoxy method. 

Cleavage of the DNA with restriction enzymes was accomplished by reaction for one hour using 3- to 10-fold
amounts of the enzyme as indicated by the manufacturer. Analysis of the plasmid structures was made using 0.5 to 1
µg of DNA in a 20 µl reaction solution, and the DNA was prepared using 3 to 10 µg of DNA in a 50 to 100 µl reaction
solution. The reaction temperature and reaction buffer conditions were as indicated by the manufacturer.

Agarose gel electrophoresis samples were prepared by adding a 1/5 volume of a pigment solution (15% (w/v)
Ficoll aqueous solution containing 0.25% (w/v) bromphenol blue) to the reaction solution. The agarose gel
electrophoresis buffer used was a TAE buffer (10 mM Tris, 20 mM acetic acid, 2 mM EDTA). For structural analysis of the
plasmids, Mupid-2 (Cosmo Bio, KK.) was used for electrophoresis at 100 volts for one hour, and for preparation of the
DNA fragments, a horizontal gel (20 cm x 15 cm x 0.7 cm) was used for electrophoresis at 150 volts for 4 hours or 35
volts for 13 hours. After staining of the gel for 20 minutes with ethidium bromide aqueous solution (0.5 µg/ml), the DNA
bands were detected with ultraviolet irradiation. The agarose gel concentrations used were 1.0, 1.5 and 2.0% (w/v)
depending on the size of the DNA fragments to be fractionated.

The DNA in the agarose gel was eluted by placing the gel in a dialysis tube filled with 0.1 x TAE buffer and applying 
a voltage, or by extraction from the gel using SUPREC-01 (Takara Shuzo, KK.). The DNA solutions were treated with 
phenol and then precipitated with ethanol. 

The ligation reaction was conducted by adding 10 units of T4 DNA ligase to 30 µl of a reaction solution (67 mM
Tris/HCl (pH 7.5), 5 mM MgCl2, 5 mM DTT, 1 mM ATP) containing 0.05-1 µg of DNA fragments and reacting at 16°C
for 12 to 18 hours, or by using a TAKARA Ligation Kit (Takara Shuzo).

The transformation of E. coli was accomplished by the calcium chloride method (competent cells of JM109 were
purchased for use), and the transformants were selected on the basis of drug resistance (ampicillin or tetracycline).
The transformation of the yeast strain K16-57C was accomplished by the lithium acetate method (METHODS IN YEAST
GENETICS; A Laboratory Course Manual, Cold Spring Harbor Laboratory Press), and the transformants were selected
on the basis of complementation of tryptophan auxotrophy. Transformation of strain TK62 has been described by Sakai
et al. (Sakai et al., J. Bacteriol., 173, 7458-7463, 1991).

Measurement of Kex2 activity was according to the method of Mizuno et al. (Mizuno et al., Biochem. Biophys. Res.
Commun. 156, 246-254, 1988). That is, 100 µl of Kex2 diluted with 100 mM Tris/HCl (pH 7.0) was added to 100 µl of
200 mM Tris/HCl (pH 7.0) solution containing 2 mM CaCl2, 0.2% (w/v) Lubrol and 100 µM Boc-Leu-Arg-Arg-MCA
(Peptide Laboratories, KK.), and the mixture was allowed to stand at 37°C for 30 minutes. The reaction was terminated
by addition of 50 µl of 25 mM EGTA. The fluorescence intensity of the released fluorescent substance (AMC) was
measured using a PANDEX FCA system (Model 10-015-1 of Baxter Travenol (excitation wavelength = 365 nm, base
wavelength = 450 nm)). The amount of Kex2 which released 1 pmol of AMC in one minute under the conditions described
above was defined as 1 U.
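
The unit definition above can be expressed as a short calculation. This is a sketch under stated assumptions: the amount of AMC released (in pmol) is taken as already known, for example from a fluorescence standard curve that is not described in the text, and the function name and example figures are hypothetical. The 30 minute reaction time and the 100 µl sample volume are those of the assay described above.

```python
REACTION_MINUTES = 30.0   # incubation time at 37°C stated in the assay
SAMPLE_VOLUME_ML = 0.1    # 100 µl of diluted Kex2 added per reaction


def kex2_units_per_ml(amc_pmol: float, dilution_factor: float = 1.0) -> float:
    """Kex2 activity (U/ml) of the undiluted sample.

    1 U releases 1 pmol of AMC per minute under the assay conditions.
    amc_pmol is the total AMC released in one reaction (assumed to come
    from a standard curve); dilution_factor is the fold dilution applied
    to the sample before the assay.
    """
    if amc_pmol < 0 or dilution_factor <= 0:
        raise ValueError("amc_pmol must be >= 0 and dilution_factor > 0")
    units_in_reaction = amc_pmol / REACTION_MINUTES
    return units_in_reaction / SAMPLE_VOLUME_ML * dilution_factor


if __name__ == "__main__":
    # Hypothetical reading: 3000 pmol AMC released by a 64-fold diluted supernatant.
    print(kex2_units_per_ml(3000.0, dilution_factor=64.0))
```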

The SDS-polyacrylamide electrophoresis (SDS-PAGE) was carried out according to the method of Laemmli (Laemmli
et al., Nature 227, 680-685, 1970). That is, a 1/4 volume of 4xSDS sample buffer (375 mM Tris/HCl (pH 6.8),
30% (v/v) glycerol, 7% (w/v) SDS, 15% (v/v) 2-mercaptoethanol, 0.1% (w/v) bromphenol blue) was added to the sample,
and the mixture was heated at 90°C for 5 minutes. A 10 µl portion was supplied to an SDS-polyacrylamide gel (55 mm
x 85 mm x 1 mm or TEFCO Co.) for electrophoresis at 20 mA for 80 minutes. After electrophoresis, the gel was stained
with a staining solution (10% (v/v) acetic acid, 40% (v/v) methanol, 0.25% (w/v) Coomassie brilliant blue R250).

The rest of the basic gene manipulation, except where otherwise stated, was conducted according to the method
described in Molecular Cloning (ed. Maniatis et al., Cold Spring Harbor, Cold Spring Harbor Laboratory, New York,
1982).

Reference Example 1. Preparation of chimeric protein βGal-139S(FM)PPH84

1) Construction of hProPTH(1-84) gene (Figs. 1 and 2) 

The hProPTH(1-84) gene was synthesized as the 14 fragments U1 to U7 (SEQ ID NOS.5 to 11) and L1 to L7 (SEQ
ID NOS.12 to 18), as shown in Fig. 1.

The hProPTH(1-84) gene was constructed by linking each of the fragments in the following manner (Fig. 2). First,
the DNA fragments U1 (SEQ ID NO.5) and L7 (SEQ ID NO.18) (about 1 µg each) were reacted at 37°C for 15 minutes
in 15 µl of a phosphorylation reaction solution (50 mM Tris/HCl (pH 7.6), 10 mM MgCl2, 5 mM DTT) containing 16 units
of T4 polynucleotide kinase and 0.5 nM (over 1 MBq) of [γ-32P]dATP. To this there was added 5 µl of a phosphorylation
reaction solution containing 5 mM ATP, and further reaction was conducted at 37°C for 45 minutes. The same procedure
was followed for U2 (SEQ ID NO.6) and L6 (SEQ ID NO.17), U3 (SEQ ID NO.7) and L5 (SEQ ID NO.16), U4 (SEQ ID
NO.8) and L4 (SEQ ID NO.15), U5 (SEQ ID NO.9) and L3 (SEQ ID NO.14), U6 (SEQ ID NO.10) and L2 (SEQ ID
NO.13) and U7 (SEQ ID NO.11) and L1 (SEQ ID NO.12).

The aforementioned 7 reaction solutions were pooled into one, and ethanol precipitation was performed to recover
the DNA. This was dissolved in 80 µl of a solution of 100 mM Tris/HCl (pH 7.6), 6.5 mM MgCl2 and 300 mM NaCl. After
allowing 40 µl thereof to stand at 95°C for 5 minutes, the temperature was lowered to 43°C over 30 minutes. After
cooling on ice, 40 µl of ligation B solution (Takara Shuzo, KK.) was added and the mixture was allowed to stand at
26°C for 15 minutes.

The sample was subjected to 5% polyacrylamide electrophoresis. After electrophoresis, the linked DNA fragments
were detected by autoradiography. A DNA fragment corresponding to approximately 280 bp was extracted from the
gel and purified according to an established method.
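
The pairing of the 14 fragments in the 7 phosphorylation reactions follows a regular pattern (U1 with L7, U2 with L6, and so on), and the SEQ ID numbers run consecutively. The following sketch simply enumerates that pattern as stated in the text; the function name and data layout are illustrative and not part of the described procedure.

```python
def phosphorylation_pairs(n: int = 7, first_upper_seq_id: int = 5):
    """List the (upper, lower) fragment pairs with their SEQ ID numbers.

    Upper fragments U1..Un carry SEQ ID NOS. first_upper_seq_id onwards;
    lower fragments L1..Ln follow directly after them. Fragment Ui is
    paired with L(n + 1 - i), as described for the 7 reactions in the text.
    """
    pairs = []
    for i in range(1, n + 1):
        j = n + 1 - i
        upper = (f"U{i}", first_upper_seq_id + i - 1)
        lower = (f"L{j}", first_upper_seq_id + n + j - 1)
        pairs.append((upper, lower))
    return pairs


if __name__ == "__main__":
    for (u_name, u_id), (l_name, l_id) in phosphorylation_pairs():
        print(f"{u_name} (SEQ ID NO.{u_id}) + {l_name} (SEQ ID NO.{l_id})")
```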

2) Construction of the βGal-139S(FM)PPH84-expressing plasmid pGP#19 (Figs. 3 and 4)

The approximately 280 bp DNA fragment containing the synthetic hProPTH(1-84) gene includes the restriction
enzyme EcoRI site at the 5'-end and the restriction enzyme SalI site at the 3'-end. Cloning of the hProPTH(1-84) gene
was accomplished by inserting this EcoRI/SalI DNA fragment at the EcoRI/SalI site of pG210ShCT[G].

After cleaving pG210ShCT[G] with restriction enzymes EcoRI and SalI, an approximately 3.5 kb DNA fragment
containing the vector portion was prepared. This was linked with the approximately 280 bp DNA fragment of the
hProPTH(1-84) gene obtained in 1) above, to obtain plasmid pG210ShProPTH (Fig. 3). pG210ShProPTH was used
to transform E. coli JM109, obtaining JM109[pG210ShProPTH].

Also, the linkers KM091 (SEQ ID NO.19) and KM092 (SEQ ID NO.20) were inserted at the restriction enzyme
XhoI/EcoRI site of pG210ShProPTH, to construct plasmid pG210S(S/X) (Fig. 3). This linker has the restriction enzyme
XhoI and EcoRI sites at either end, and a SacI site between them.

After digesting plasmid pG210S(S/X) with restriction enzymes SacI and XhoI, a Kilo-Sequence Deletion Kit (Takara
Shuzo, KK.) was used for time-dependent specific deletion of the DNA region coding for βGal-210S. After modification
of the ends with Klenow fragment, self-ligation was performed to obtain plasmid pGP#19 coding for the chimeric protein
βGal-139S(FM)PPH84 which has βGal-139S and hProPTH(1-84) linked via Phe-Met (Fig. 4). E. coli JM109 having
pGP#19 introduced therein is named JM109[pGP#19].

3) Preparation of chimeric protein βGal-139S(FM)PPH84

JM109[pGP#19] was seeded in a 1 L Erlenmeyer flask containing 200 ml of SB medium and cultured at 37°C with
shaking for 16 hours. The total preculturing solution was transferred into 3 L of NU medium containing 10 µg/ml
tetracycline, and aerobically shake cultured at 37°C using a 5 L fermenter (Model KMJ-5B-4U-FP, product of Mitsuwa
Physicochemical Industries, KK.). The aeration volume was 3 L/min and the shaking speed was adjusted so that the
amount of dissolved oxygen remained over 2.0 ppm.

The pH was kept at pH 7 using 9% (v/v) ammonia water and 1 M phosphoric acid. The carbon source provided
was glycerol added at 10 ml per 1 L of culture solution on the 3rd, 9th and 14th hours after the start of culturing, and
the nitrogen source was a 5-fold concentration of SB medium added at 10 ml per 1 L of culture solution at 9.5 hours
after the start of culturing. An antifoaming agent (Disfoam CC-222, Nihon Yushi, KK.) was added at 300 µl/L at the
start of culturing, and was added thereafter as necessary.

The OD660 after 18 hours of culturing was 55, and about 0.5 mg of the chimeric protein βGal-139S(FM)PPH84
was produced per ml of culture solution. The chimeric protein was produced as insoluble inclusion bodies, which
were purified in the following manner. A 1.5 L portion of culture solution was subjected to centrifugation at 6000 rpm,
4°C for 10 minutes (20PR-52D, product of Hitachi Laboratory, KK.), and the cells were collected. The cells were
suspended in 320 ml of 100 mM Tris/HCl (pH 7.0) and disrupted with a French pressure cell press (twice at 10,000 psi).

The disrupted cell solution was centrifuged at 4000 rpm, 4°C for 15 minutes (05PR-22, product of Hitachi
Laboratory, KK.: 50 ml plastic tube, product of Sumitomo Bakelite, KK.). After suspending the precipitate in 30 ml of 20 mM
Tris/HCl (pH 7.0) containing 0.5% (w/w) Triton X-100, the suspension was centrifuged at 3000 rpm, 4°C for 15 minutes,
and the precipitate was recovered. This procedure was repeated 4 times to obtain the prepurified chimeric protein.

The purity of the prepurified chimeric protein was approximately 70% (estimated by SDS-PAGE), and the amount 
of protein was about 670 mg (assayed by the Bradford method using bovine serum albumin as the standard). 
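
The figures given above allow a rough estimate of how much of the chimeric protein survived the washing steps. The recovery value is not stated in the text; it is derived here from the stated figures (about 0.5 mg/ml in the 1.5 L portion processed, about 670 mg of protein at approximately 70% purity), and the function names are illustrative.

```python
def expected_chimeric_mg(titer_mg_per_ml: float, culture_l: float) -> float:
    """Chimeric protein (mg) present in the processed culture volume."""
    return titer_mg_per_ml * culture_l * 1000.0


def chimeric_in_prep_mg(total_protein_mg: float, purity_fraction: float) -> float:
    """Chimeric protein (mg) in a preparation of given total protein and purity."""
    return total_protein_mg * purity_fraction


def recovery_fraction(recovered_mg: float, expected_mg: float) -> float:
    """Fraction of the starting chimeric protein recovered."""
    return recovered_mg / expected_mg


if __name__ == "__main__":
    expected = expected_chimeric_mg(0.5, 1.5)      # from the stated titer
    recovered = chimeric_in_prep_mg(670.0, 0.70)   # from Bradford and SDS-PAGE figures
    print(f"expected {expected:.0f} mg, recovered {recovered:.0f} mg, "
          f"recovery {recovery_fraction(recovered, expected):.0%}")
```

Since both the titer and the purity are approximate values, the resulting recovery should be read as an order-of-magnitude estimate only.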
The prepurified chimeric protein was subjected to high performance liquid chromatography (HPLC: Waters 660E
by Millipore, KK.) using a YMC Packed column (2 cm x 25 cm, product of Yamamura Chemical Laboratory) for
purification. The chimeric protein was eluted with a linear concentration gradient of acetonitrile (A: 0.1% (v/v) trifluoroacetic
acid (TFA); B: 0.1% (v/v) TFA/80% (v/v) acetonitrile; %B = 30%-60%/60 minutes, flow rate = 10 ml/min). Each of the
fractions was subjected to SDS-PAGE, and the fractions of 95% purity or greater were pooled and lyophilized.

The lyophilized chimeric protein was again dissolved in 0.1% (v/v) TFA, and then subjected to HPLC for further
purification. (The conditions were the same except that the gradient was %B = 40%-60%/60 minutes.) The fractions
of 99% purity or greater were collected based on the index of absorbance at 210 nm according to analysis by analytical
HPLC, and were lyophilized into a standard. The amount of protein in the standard was estimated from amino acid
analysis.

Reference Example 2. Preparation of the soluble chimeric protein CATPH34 

1) Construction of the CATPH34-expressing plasmid ptacCATPTH(1-34) (Figs. 5 and 6) 

An R4 linker (R4U: SEQ ID NO.21 and R4L: SEQ ID NO.22) was inserted at the restriction enzyme EcoRI-XhoI
site of pG97S4DhCT[G] to construct pG97S4DhCT[G]R4. The PTH(1-34) gene prepared by PCR and the proα linker
described below (proαU: SEQ ID NO.23 and proαL: SEQ ID NO.24) were inserted at the restriction enzyme XhoI-KpnI
site of the obtained plasmid pG97S4DhCT[G]R4, to construct pPTH(1-34)proα. The PTH(1-34) gene was prepared by
PCR with pGP#19 as the template, using primers P1 (SEQ ID NO.25) and P2 (SEQ ID NO.26) (Fig. 5).

Next, primers CAT1 and CAT3 (SEQ ID NOS.27 and 28) were synthesized in order to insert the restriction enzyme
XhoI site at the 3'-end of the CAT (chloramphenicol acetyltransferase) gene. The CAT gene having the restriction
enzyme XhoI site inserted at the 3'-end thereof was obtained by PCR using CAT1 and CAT3 as the primers and ptacCAT
as the template DNA. This was digested with restriction enzymes NcoI and XhoI, after which it was linked with the
ptacCAT-derived SalI-NcoI DNA fragment (3.6 kbp) and the pPTH(1-34)proα-derived XhoI-SalI DNA fragment
(0.15 kbp) to construct ptacCATPTH(1-34) (Fig. 6).

2) Preparation of chimeric protein CATPH34 (see Fig. 7). 

Strain JM101 containing ptacCATPTH(1-34) was cultured in LB medium at 37°C. IPTG (isopropyl beta-thiogalactoside)
was added to a final concentration of 2 mM when the OD660 value of the culture solution reached 0.6, and
culturing was continued for 3 hours to produce the chimeric protein CATPH34. After completion of the culturing,
centrifugation (8000 rpm, 20 minutes) was performed to collect the cells, and solution A (50 mM Tris/HCl (pH 8.0), 2 mM
EDTA, 0.1 mM 2-mercaptoethanol, 0.1 mM PMSF) was added until the OD660 value of the suspension reached 70.
Next, 3 ml of the cell suspension was subjected to ultrasonic treatment, and after disruption of the cells, the soluble
fraction was separated by centrifugation (12,000 rpm, 10 minutes), and applied to a chloramphenicol caproate (Sigma
C-8899) column (3 ml) equilibrated with solution A. After washing the column with solution A containing 1 M NaCl, the
chimeric protein was eluted with solution A containing 10 mM chloramphenicol and 1 M NaCl. Fig. 7 shows the results
of SDS-PAGE for samples before and after purification. Lane 1 is the molecular weight marker, lane 2 is the soluble
fraction after cell disruption, and lane 3 is the chimeric protein CATPTH(1-34) after purification. The numbers to the
left of lane 1 indicate the sizes of the molecular weight markers (kDa).

The chimeric protein was produced in the soluble fraction, and was easily purified by affinity chromatography using 
chloramphenicol caproate. 

Reference Example 3. Preparation of insoluble chimeric protein βGal-117S4HPPH34

1) Construction of the βGal-117S4HPPH34-expressing plasmid pG117S4HPPH34 (Figs. 8 to 10)

pGP#19 was used as the template and S01 (SEQ ID NO.29) and S02 (SEQ ID NO.30) as primers for PCR to
amplify a DNA fragment in which the 35th codon GTT of hPTH(1-84) was replaced with the translation termination
codon TAA, after which the restriction enzyme AatII-SalI DNA fragment was isolated and purified by common methods
and exchanged with the corresponding portion of pGP#19 to construct pGP#19PPH34 (Fig. 8).

Next, a DNA fragment obtained by amplification by PCR using pG210S(S/X) as the template and S03 (SEQ ID
NO.31) and S05 (SEQ ID NO.32) as the primers followed by digestion with restriction enzymes SalI and SmaI, a DNA
fragment obtained by amplification by PCR using pGP#19PPH34 as the template and S07 (SEQ ID NO.33) and S02
(SEQ ID NO.30) as the primers followed by digestion with restriction enzymes SalI and SmaI, and a restriction enzyme
PvuI-SalI DNA fragment containing the replication initiation origin of pGP#19PPH34 were linked with T4 ligase, to
construct pG117SPPH34 (Figs. 9 to 10).

Linkers S08 (SEQ ID NO.34) and S09 (SEQ ID NO.35) coding for (His)4-Pro-Gly were inserted at the restriction
enzyme SmaI site of pG117SPPH34 to construct pG117S4HPPH34 (Fig. 10). The orientation of the linkers was
confirmed by determining the DNA nucleotide sequences after preparing the plasmids.

2) Production of chimeric protein βGal-117S4HPPH34

To obtain the chimeric protein βGal-117S4HPPH34 in a large amount, E. coli M25[pG117S4HPPH34] in which the
expression vector for the chimeric protein had been introduced was cultured at 37°C in 20 L of SB2 medium. At a
cell concentration of OD660 = 1.0, IPTG was added to a final concentration of 1 mM, and culturing was continued until
the cell concentration reached OD660 = 12. Disfoam CC-222 (product of Nihon Yushi, KK.) was used as an antifoaming
agent. After collecting the cells, they were suspended in TE (10 mM Tris/HCl, 1 mM EDTA, pH 8.0), and this was
followed by cell disruption with a high-pressure homogenizer (Manton-Gaulin), centrifugation, and suspension and
washing with TE and deionized water, to obtain about 100 g of inclusion bodies.

Example 1. Expression of secretory Kex2 derivatives in Saccharomyces cerevisiae

For purification of a large amount of enzymes with Kex2 protease activity, not only must the yields be high, but the
purification thereof must also be simple, and for this purpose the inventors considered it advantageous to carry out
secretion in a culture solution containing few other proteins. Although ssKex2 has been reported as a secretory Kex2
derivative, its yield is 4 mg/L of culture solution, which is too low for use on an industrial scale. Thus, different secretory
Kex2 derivatives were constructed first, and were expressed in Saccharomyces cerevisiae to investigate the secretion
yields, selecting among them the Kex2 derivatives with the greatest secretion yields.

1) Construction of secretory Kex2 derivative-expressing plasmids (Figs. 11, 12 and 13)

The secretory KEX2 gene was constructed by PCR. The primer sequences are shown in Fig. 11(b). KM085 (SEQ
ID NO.36) has the restriction enzyme EcoRI site (underlined) at the 5'-end, and KM088 (SEQ ID NO.37), KM089 (SEQ
ID NO.38), KM090 (SEQ ID NO.39) and KM093 (SEQ ID NO.40) have the restriction enzyme SalI site (underlined) at
their 5'-ends.

These primers correspond to the KEX2 gene region shown in Fig. 11(a), with KM085 including a nucleotide
sequence coding for the initial methionine of the KEX2 gene, and KM088, KM089, KM090 and KM093 having nucleotide
sequences which are antisense to sequences in which the translation termination codon TAA is added directly after the
660th, 679th, 688th and 699th amino acids from the N-terminus, respectively.
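
The design of the antisense primers described above can be illustrated with a small calculation. The sketch below records which primer truncates Kex2 after which residue (as stated in the text), computes the length of each truncated coding region including the added TAA codon, and shows how a stop codon reads on the antisense strand. The actual primer sequences are those of SEQ ID NOS.37 to 40 and are not reproduced here; the function names are illustrative.

```python
# Antisense primer -> last Kex2 residue retained, as stated in the text.
TRUNCATION_PRIMERS = {"KM088": 660, "KM089": 679, "KM090": 688, "KM093": 699}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def coding_length_nt(last_residue: int) -> int:
    """Nucleotides from the initiation codon through the added stop codon."""
    return 3 * last_residue + 3


def reverse_complement(seq: str) -> str:
    """Sequence of the antisense strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for primer, residue in TRUNCATION_PRIMERS.items():
        print(f"{primer}: Kex2-{residue}, coding region {coding_length_nt(residue)} nt")
    # On the antisense primer the TAA stop codon appears as its reverse complement.
    print("TAA on the antisense strand:", reverse_complement("TAA"))
```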

A PCR reaction was conducted using plasmid pYE-KEX2(5.0)b, linearized by cutting with restriction enzyme EcoRI,
as the template and KM085 and KM088 as primers. The purified reaction product was cleaved with restriction
enzymes EcoRI and SalI to obtain an EcoRI-SalI DNA fragment. This DNA fragment has the DNA nucleotide sequence
coding for Kex2-660 (KEX2-660), with the restriction enzyme EcoRI site upstream and the restriction enzyme SalI site
downstream.

Next, after cleaving plasmid pYE-22m with restriction enzymes EcoRI and SalI, the approximately 8.3 kb DNA
fragment containing the vector portion was purified. This was linked with the EcoRI-SalI DNA fragment containing the
gene coding for Kex2-660 obtained earlier, to obtain plasmid pYE-660 (Fig. 12).

In the same manner, KM089, KM090 and KM093 were used instead of the primer KM088, and the EcoRI-SalI DNA
fragments containing nucleotide sequences coding for Kex2-679, Kex2-688 and Kex2-699 (KEX2-679, KEX2-688,
KEX2-699) were recovered and linked with the EcoRI-SalI fragment of plasmid pYE-22m, to obtain plasmids pYE-679,
pYE-688 and pYE-699.

Plasmid pYE-614 was constructed by replacing the BglII-SalI DNA fragment containing a portion of the KEX2 gene
of pYE-KEX2 (RI-PvuII) with the BglII-SalI DNA fragment containing a portion of the KEX2-660 gene of pYE-660 (Fig.
13).

2) Transformation and expression of secretory Kex2 derivatives (see Figs. 14 and 15) 

The plasmids (pYE-22m, pYE-614, pYE-660, pYE-679, pYE-688 and pYE-699) were each introduced into strain
K16-57C to obtain strains K16-57C[pYE-22m], K16-57C[pYE-614], K16-57C[pYE-660], K16-57C[pYE-679],
K16-57C[pYE-688] and K16-57C[pYE-699].

The amounts of Kex2 derivative secretion in the culture solutions were determined by assay of Kex2 activity in the 
culture supernatants and SDS-PAGE of their concentrates. 

The colonies were seeded into 4 ml of YCDP medium and then cultured overnight at 32°C with shaking. After
transferring 100 µl of culture solution to 4 ml of YCDP medium, it was cultured overnight at 32°C with shaking. One ml
of the culture solution was centrifuged at 12,000 rpm, 5 minutes, 4°C (MRX-150, Tomy Seiko) to obtain the culture
supernatant. After diluting the culture supernatant 2- to 64-fold with 100 mM Tris/HCl (pH 7.0), the Kex2 activity was
measured. The results are shown in Fig. 14. The Kex2 activities per OD660 of K16-57C[pYE-660], K16-57C[pYE-679]
and K16-57C[pYE-688] were 25, 15 and 1.2 times greater, respectively, than that of K16-57C[pYE-614]. No Kex2
activity was detected in the culture supernatants of K16-57C[pYE-22m] and K16-57C[pYE-699].

The samples for SDS-PAGE were prepared by concentrating the culture supernatants 20-fold using an Ultrafree-C3GC
(Millipore, KK.; fractionation molecular weight = 10,000), and approximately 200 µl of culture supernatant was
used per lane. The results are shown in Fig. 15. Lanes 1 and 7 are molecular weight markers, lane 2 is for
K16-57C[pYE-22m], lane 3 is for K16-57C[pYE-614], lane 4 is for K16-57C[pYE-660], lane 5 is for K16-57C[pYE-679] and lane
6 is for K16-57C[pYE-688]. The numbers to the left of lane 1 indicate the sizes of the molecular weight markers (kDa).

It was shown that Kex2-660 and Kex2-679 were secreted in greater amounts than Kex2-614, in agreement with their activities.
It was also shown that their molecular weights increased in step with the number of amino acid residues, i.e.
that no autolysis products accumulated in this culturing, unlike what occurs with Kex2Δp in the insect cell host Sf9.

The secretory yields of Kex2-660 and Kex2-679 were found to be much greater, at least 10 times greater, than the
secretory yield of Kex2-614 hitherto reported. Also, since no notable autolysis was observed in the culture supernatants,
even higher yields may be expected by investigating other methods of production.

Example 2. Purification of Kex2-660 

Culturing of K16-57C[pYE-660] was carried out on a greater scale, with the object of purifying Kex2-660 from the
culture supernatant.

K16-57C[pYE-660] was cultured overnight at 32°C in 3 L of YCDP medium. A 2.3 L portion of the culture supernatant
was subjected to concentration and exchange of the buffer solution (20 mM Bis-Tris/HCl (pH 6.0), 50 mM NaCl,
0.2 mM CaCl2), using an ultrafiltration module (UF-LMSII System: UF2CS-3000PS, Toso, KK.) (final volume: 275 ml).
A 210 ml portion thereof was adsorbed onto a Q-Sepharose XK16 (Pharmacia, KK.) column equilibrated beforehand
with the same buffer solution. After washing with 75 ml of the same buffer solution, elution (120 ml) was performed
in the same buffer solution with a linear concentration gradient of 50 to 350 mM NaCl. The flow rate was
3 ml/min. Kex2 activity was recovered in 24 ml of eluted fractions of 150 to 250 mM NaCl concentration. After adding
6.6 g of ammonium sulfate to the eluate, 1 N HCl was used to adjust the pH to 6.0, and the volume was increased to
30 ml with 20 mM Bis-Tris/HCl (pH 6.0), 0.2 mM CaCl2.

A 15 ml portion thereof was adsorbed onto a Phenyl Superose HR 5/5 (Pharmacia) column equilibrated beforehand
with a 20 mM Bis-Tris/HCl (pH 6.0), 0.2 mM CaCl2 solution containing 2 M ammonium sulfate. After washing with 2.5
ml of the same buffer solution, elution (15 ml) was performed in 20 mM Bis-Tris/HCl (pH 6.0), 0.2 mM CaCl2 with a
linear concentration gradient of 2 to 0 M ammonium sulfate. The flow rate was 0.5 ml/min. Kex2 activity was recovered
in 2.25 ml of eluted fractions of 0.8 to 0.6 M ammonium sulfate concentration. The recovery rates of Kex2-660 at each
stage are shown in Table 1. The overall recovery rate was determined by multiplying together the recovery rates at each stage.

Table 1
Kex2-660 purification

Stage                   Activity (x10^? U/ml)    Recovery rate (%)
                                                 *1        *2
Culture supernatant     9.3                                100
0.22 µm filtration      9.2                      99        99
Ultrafiltration         50                       99        97
Q-Sepharose             250                      63        61
Phenyl Superose         1,740                    93        57

*1: Recovery rate at each stage
*2: Recovery rate from culture supernatant sample
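
The relation between the two recovery columns of Table 1 can be checked with a short calculation. The following Python sketch is given for illustration only and is not part of the original procedure; the function name is hypothetical. It takes the per-stage recovery rates of column *1 and forms their running product. The small differences from column *2 (for example about 98% against the tabulated 97% after ultrafiltration) presumably reflect rounding of the tabulated values.

```python
from itertools import accumulate

# Per-stage recovery rates (%) taken from column *1 of Table 1.
stage_recovery = {
    "0.22 um filtration": 99,
    "Ultrafiltration": 99,
    "Q-Sepharose": 63,
    "Phenyl Superose": 93,
}


def cumulative_recovery(per_stage_percent):
    """Return the running product of the stage recoveries, in percent."""
    fractions = [p / 100 for p in per_stage_percent]
    return [100 * f for f in accumulate(fractions, lambda a, b: a * b)]


overall = cumulative_recovery(stage_recovery.values())
for stage, value in zip(stage_recovery, overall):
    print(f"{stage:20s} {value:5.1f} %")
```

The last value printed is the overall recovery from the culture supernatant through the Phenyl Superose stage.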



Example 3. Effect of urea on Kex2 protease activity of Kex2-660 

In chimeric protein expression processes, the chimeric protein usually forms insoluble inclusion bodies, and thus
urea or the like is used as a denaturing agent to solubilize it. The effect of urea on the activity of Kex2 protease and
secretory Kex2 derivatives has not been reported. Thus, we determined the effect of urea on the protease activity of
Kex2-660 using Boc-Leu-Arg-Arg-MCA and the chimeric protein βGal-139S(FM)PPH84 as substrates.

1) Effect of urea on Kex2 protease activity of Kex2-660 with synthetic substrate 

The activity of the Kex2-660 purified in Example 2 (adjusted to a final concentration of 80 to 1200 U/ml) at final
urea concentrations of 0, 1, 2 and 4 M was investigated using Boc-Leu-Arg-Arg-MCA. The reaction conditions were
the same as in the Kex2 activity assay method described earlier, except for the urea. The activities at urea concentrations
of 1, 2 and 4 M were found to be 70%, 40% and 10%, respectively, with respect to 100% activity in the absence of
urea (Fig. 16). For activation of protease, etc. after dissolving the insoluble inclusion bodies in urea solution, a 2 to 4
M urea concentration is generally used. Thus, it was concluded that Kex2-660 can be used for excision of desired
peptides from chimeric proteins, if the dissolution conditions of the chimeric proteins are appropriately determined.



2) Effect of urea on Kex2 protease activity of Kex2-660 with protein substrate 

The effect of urea on the protease activity of Kex2-660 was investigated using the βGal-139S(FM)PPH84 prepared
in Reference Example 1 and the Kex2-660 purified in Example 2. First, reaction was conducted at 37°C for 30 minutes
under the reaction conditions described below, using urea concentrations of 1.5 to 3.0 M. After adding a 4-fold volume
of 5 N acetic acid, 50 µl of the reaction solution was subjected to high performance liquid chromatography (HPLC:
LC6A, product of Shimazu Laboratories) using a YMC-ODS-A302 column (d 4.6 mm x 150 mm, product of Yamamura
Chemical Laboratories), and eluted with a linear concentration gradient (A: 0.1% (v/v) trifluoroacetic acid (TFA); B:
0.094% (v/v) TFA/80% (v/v) acetonitrile; %B = 30%→60%/30 minutes, flow rate = 1 ml/min).



    2 mg/ml          βGal-139S(FM)PPH84
    100 mM           Tris/HCl (pH 7.0)
    1.5 to 3.0 M     urea
    1 mM             CaCl2
    50 kU/ml         Kex2-660



The peaks newly appearing after Kex2-660 processing were fractionated, and N-terminal amino acid sequencing and
amino acid compositional analysis identified them as βGal(1-14), hPTH(1-84) and hPTH(1-44).
βGal-139S(FM)PPH84 has 4 sites with sequences predicted to be cleaved by Kex2 protease: Arg-Arg (cleavage site A),
Lys-Arg (cleavage site B) and Pro-Arg (cleavage sites C and D). Cleavage at cleavage sites A, B and C was confirmed
from the identified peptide fragments, but cleavage at cleavage site D could not be confirmed.

The recovery rates for each of the peptide fragments produced after Kex2-660 processing were determined in the
following manner.

    Recovery rate (%) = FPA x CAA x 100 / (CPA x FAA)

    FPA: Peak area for each peptide fragment after Kex2-660 processing
    CAA: Number of amino acids of βGal-139S(FM)PPH84 (231 amino acids)
    CPA: Peak area for βGal-139S(FM)PPH84 before Kex2-660 processing
    FAA: Number of amino acids of each peptide fragment
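
The recovery-rate formula above can be written as a small function. The following Python sketch is illustrative only: the function name is hypothetical and the peak areas in the usage example are invented round numbers, not measured values. The formula divides each peak area by the number of residues of the peptide, which presumably treats the detector response as proportional to peptide length.

```python
def recovery_rate(fpa, cpa, faa, caa=231):
    """Recovery rate (%) = FPA x CAA x 100 / (CPA x FAA).

    fpa: peak area of the fragment after Kex2-660 processing
    cpa: peak area of the chimeric protein before processing
    faa: number of amino acids in the fragment
    caa: number of amino acids in the chimeric protein (231 in the text)
    """
    if cpa <= 0 or faa <= 0:
        raise ValueError("cpa and faa must be positive")
    return fpa * caa * 100 / (cpa * faa)


# Hypothetical peak areas, chosen only to show the arithmetic.
example = recovery_rate(fpa=63000, cpa=231000, faa=84)
print(f"hPTH(1-84) recovery: {example:.1f} %")
```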

The results are shown in Fig. 17. The open squares, open circles, solid circles and open triangles represent,
respectively, recovery rates for βGal(1-14), hPTH(1-84), hPTH(1-44) and hPTH(1-84) + hPTH(1-44).
It was shown that in the urea concentration range of 1.5 to 2.5 M, an increase in urea concentration resulted in a
higher recovery rate of the peptide excised by Kex2-660, i.e. an increase in the cleavage efficiency at cleavage sites
A, B and C.

On the other hand, it was found that in the urea concentration range of 2.5 to 3.0 M the recovery rate for βGal(1-14)
and hPTH(1-44) fell while the recovery rate for [hPTH(1-84) + hPTH(1-44)] was virtually unchanged and that of
hPTH(1-84) increased; thus the cleavage efficiency at cleavage site B was unchanged while the cleavage efficiency
at cleavage sites A and C fell. In other words, this showed that with cleavage of βGal-139S(FM)PPH84 by Kex2-660,
increasing urea concentrations of up to 2.5 M give higher cleavage efficiency, while at urea concentrations of 2.5 to
3.0 M differences in cleavage efficiency occur depending on the sequence.

The following experiment was then conducted with higher urea concentrations of 3.0 to 4.0 M, to determine the
peptide fragment recovery rates, etc.



    2 mg/ml          βGal-139S(FM)PPH84
    100 mM           Tris/HCl (pH 7.0)
    3.0 to 4.0 M     urea
    1 mM             CaCl2
    20 kU/ml         Kex2-660



The results are shown in Fig. 18. The symbols and calculations for the recovery rates were as explained above.
As a result, no notable difference was observed in the recovery rates of any of the fragments at urea concentrations
of 3.0 to 4.0 M, although increasing urea concentrations gave lower recovery rates for βGal(1-14) and hPTH(1-44),
and a tendency toward increased hPTH(1-84) was seen. That is, it was found that with increasing urea concentration the
cleavage efficiency at cleavage site B was slightly greater, while the cleavage efficiency at cleavage site C was reduced.
To summarize the results for cleavage of βGal-139S(FM)PPH84 by Kex2-660 at urea concentrations of 1.5 to 4.0
M, it was found that increasing urea concentration within a urea concentration range of 1.5 to 2.5 M gives greater
cleavage efficiency at cleavage sites A, B and C, and in the urea concentration range of 2.5 to 3.5 M the cleavage
efficiency at cleavage site B increases while the cleavage efficiency at cleavage site C decreases. These findings
could not be predicted from results with synthetic substrates, and were first arrived at by the present invention.

From the standpoint of excision of hPTH(1-84) from chimeric proteins, it became clear that a urea concentration
of 3.5 to 4.0 M is preferred, at which the cleavage efficiency at cleavage site B is high and the cleavage efficiency at
cleavage site C is low.



Example 4. Excision of hPTH(1-84) from βGal-139S(FM)PPH84 with Kex2-660






In Example 3, βGal(1-14), hPTH(1-84) and hPTH(1-44) were identified and the effect of urea on the cleavage
efficiency of βGal-139S(FM)PPH84 by Kex2-660 was investigated based on their peptide recovery rates. However,
the peptide fragment hPTH(45-84) could not be identified. Thus, a sample of hPTH(1-84) processed with Kex2-660
was separated and analyzed by HPLC under various elution conditions to identify hPTH(45-84), and it was confirmed
that it had eluted in the flow-through fraction under the previous elution conditions. These results
demonstrated that cleavage site D undergoes virtually no cleavage.

Next, different proportions of Kex2-660 (25 kU, 50 kU, 100 kU, 150 kU and 200 kU per 1 mg of chimeric protein)
were allowed to act on the chimeric protein, and after 30 minutes the recovery rates of each of the peptide fragments
were investigated. The analysis of the resulting peptide fragments was carried out in the same manner as in Example
3, except that based on the results mentioned above, the gradient conditions were changed to %B = 0%→80%/80
minutes.



    1 mg/ml          βGal-139S(FM)PPH84
    100 mM           Tris/HCl (pH 7.0)
    4 M              urea
    1 mM             CaCl2
    25 to 200 kU/ml  Kex2-660



Fig. 19 shows an elution profile of HPLC for samples unprocessed with Kex2-660 and processed with 50 kU of
Kex2-660. Peaks 1, 2, 3, 4 and 7 correspond, respectively, to hPTH(45-84) (from the amino acid after cleavage site C
to the C-terminus of hPTH(1-84)), βGal(1-14) (from the N-terminus of βGal-139S to cleavage site A), hPTH(1-84),
hPTH(1-44) (from the amino acid after cleavage site B to cleavage site C) and βGal-139S(FM)PPH84 (full length of
the chimeric protein).

Peaks 6 and 7 were larger when the amount of Kex2-660 was smaller and smaller when the amount was larger,
and since peak 5 increased as peaks 6 and 7 decreased, it was concluded that peak 5 was a peptide from the amino
acid after cleavage site A to cleavage site B of the chimeric protein, and peak 6 was a peptide from the N-terminus of
βGal-139S to cleavage site B or from the amino acid after cleavage site A to the C-terminus of hPTH(1-84). (Peak 6
may possibly be from the N-terminus of βGal-139S to cleavage site C, but this is unlikely since Arg-Arg is more easily
cut than Pro-Arg.)
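
The peak assignments discussed above can be followed more easily by listing every fragment that complete or partial cleavage at sites A, B and C could produce. The following Python sketch is illustrative only. The residue positions are inferred from the fragment names in the text under the assumption that hPTH(1-84) forms the C-terminus of the 231-residue chimeric protein: site A after residue 14, site B after residue 147 (231 - 84) and site C after residue 191 (147 + 44). Site D is left out because virtually no cleavage was observed there. The function names are hypothetical.

```python
from itertools import combinations

TOTAL_LENGTH = 231  # residues in the chimeric protein (from the text)

# Last residue before each cleavage point (inferred, see assumptions above).
SITES = {"A": 14, "B": 147, "C": 191}


def fragments(cut_sites):
    """Return (first, last) residue ranges produced by cutting at cut_sites."""
    bounds = [0] + sorted(SITES[s] for s in cut_sites) + [TOTAL_LENGTH]
    return [(lo + 1, hi) for lo, hi in zip(bounds, bounds[1:])]


def all_possible_fragments():
    """Collect the fragments from every subset of sites (partial digestion)."""
    found = set()
    for n in range(len(SITES) + 1):
        for subset in combinations(SITES, n):
            found.update(fragments(subset))
    return sorted(found)


for first, last in all_possible_fragments():
    print(f"{first:3d}-{last:3d}  ({last - first + 1} aa)")
```

In this numbering the peptide assigned to peak 5 is residues 15-147, and the two candidates given for peak 6 are residues 1-147 and 15-231.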

Also, the sizes of peaks 1, 4, 5, 6 and 7 varied in the range of 25 to 200 kU/ml, while the sizes of peaks 2 and 3
were virtually unchanged. Even when 200 kU/ml of Kex2-660 was used, no new peaks appeared. In other words, it
was confirmed that no protease activity other than that of Kex2 protease is detected even when using 8 times (200
kU/ml) the necessary amount of Kex2-660 (25 kU/ml) for excision of hPTH(1-84), and that the Kex2-660 purified in
Example 2 had no contamination by other interfering proteases under conditions at which hPTH(1-84) is excised from
chimeric proteins.

The recovery rates for the peptide fragments derived from hPTH(1-84) in the range of 25 to 200 kU/ml are
summarized in Fig. 20. It is clear that when 50 kU/ml of Kex2-660 was used, hPTH(1-84) was recovered at about 75%.
Here, although about 10% of the βGal-139S(FM)PPH84 remained, it decreased as the amount of Kex2-660 increased,
almost totally disappearing at 200 kU/ml. However, the proportion of hPTH(1-44) also increased simultaneously, and
thus the recovery rate of hPTH(1-84) did not increase. Even when the amount of Kex2-660 was raised, the increase
in the hPTH(1-84) decomposition products hPTH(1-44) and hPTH(45-84) was mild, and the recovery rate of
hPTH(1-84) was 65-75% in the range of 25 to 200 kU/ml.

These results demonstrate that Kex2-660 is capable of excising desired peptides from chimeric proteins in an
efficient manner (an excision efficiency of 75%) even when the desired peptide has a cleavage site for Kex2 protease.
This excision efficiency is higher than the excision efficiency of 50% for hPTH(1-84) using factor-Xa (Gardella et al.,
J. Biol. Chem. 265(26), 15854-15859, 1990). Gardella et al. have suggested the possibility that contaminating proteases
or factor-Xa itself degrades hPTH(1-84), judging from lower hPTH(1-84) recovery rates when the enzyme amount is
increased or the reaction time is extended, despite the fact that hPTH(1-84) does not include the factor-Xa recognition
site, i.e. the Ile-Glu-Gly-Arg sequence.

The fact that hPTH(1-84) is obtained at a high yield even though it includes 2 cleavage sequences for Kex2
protease suggests that the purified Kex2 derivatives with increased yields according to the invention
are useful as enzymes for excision of desired peptides from chimeric proteins.

Furthermore, since no peptide fragments produced by cleavage at sites other than the Kex2 protease sites were
detected even when the amount of Kex2-660 was increased, the substrate specificity of the Kex2-660
purified in Example 2 is high, and no other protease activity was detected under conditions at which desired peptides
are excised from chimeric proteins.

Example 6. Excision of hPTH(1-34) from CATPH34 by Kex2-660 

In order to excise hPTH(1-34) from the chimeric protein CATPH34, 239 µl of deionized water, 1.32 µl of 1 M CaCl2
and 30 kU of Kex2-660 were added to 60 µl taken from the eluate from Reference Example 2, and the mixture was
incubated at 37°C for one hour. After the reaction, the newly appearing peaks were subjected to amino acid analysis, and the
amino acid composition was found to match that of hPTH(1-34) (Fig. 21).

That is, it was shown that no other protease activity is detected even in the absence of urea in the reaction
solution, and thus that Kex2-660 can be used for excision of hPTH(1-34) from chimeric proteins. Furthermore, Kex2-660
is able to excise desired proteins even when the chimeric proteins have different protective peptides and cleavage site
regions, and thus it has wide industrial application.

Example 7. Excision of hPTH(1-34) from chimeric protein βGal-117S4HPPH34

To 250 ml of the βGal-117S4HPPH34 inclusion body suspension (160 g/L) prepared in Reference Example 3
there were added 100 ml of 1 M Tris/HCl (pH 8.2), 50 ml of 5 M NaCl, 500 ml of deionized water and 900 g of urea,
and after stirring to dissolution for 30 minutes in a constant temperature bath at 30°C, the solution was diluted with
warmed deionized water to 5 L at 30°C.

To this there was gently added 50 ml of a 250 mM CaCl2 solution while stirring, and then Kex2-660 was added to
20 kU/ml. After 2 hours, 7 g of hPTH(1-34) had been excised at an efficiency of over 90%. The amount of Kex2-660 used
was less than 1/20,000 of the chimeric protein (weight ratio), and this demonstrated that hPTH(1-34) was efficiently
excised from the chimeric protein.
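
The approximate final composition of the reaction solution in this example follows from the stated amounts. The following Python sketch is illustrative only. It assumes a final volume of 5 L, neglecting the roughly 1% volume increase from the 50 ml of CaCl2 solution and the enzyme, and uses 60.06 g/mol as the molecular weight of urea; the function name is hypothetical.

```python
UREA_MW = 60.06          # g/mol, molecular weight of urea
FINAL_VOLUME_ML = 5000   # assumed final volume (5 L), additions neglected


def final_conc(stock_conc, stock_volume_ml, final_volume_ml=FINAL_VOLUME_ML):
    """Concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * stock_volume_ml / final_volume_ml


urea_M = (900 / UREA_MW) / (FINAL_VOLUME_ML / 1000)  # 900 g of urea in 5 L
tris_mM = final_conc(1000, 100)    # 100 ml of 1 M Tris/HCl
nacl_mM = final_conc(5000, 50)     # 50 ml of 5 M NaCl
cacl2_mM = final_conc(250, 50)     # 50 ml of 250 mM CaCl2
kex2_total_MU = 20 * FINAL_VOLUME_ML / 1000  # 20 kU/ml x 5000 ml, in MU

print(f"urea   {urea_M:.2f} M")
print(f"Tris   {tris_mM:.0f} mM")
print(f"NaCl   {nacl_mM:.0f} mM")
print(f"CaCl2  {cacl2_mM:.1f} mM")
print(f"Kex2-660 total  {kex2_total_MU:.0f} MU")
```

Under these assumptions the urea concentration comes to about 3.0 M.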

Example 8. Expression of secretory Kex2 derivatives in Candida boidinii

The results of Example 1 demonstrated that Kex2-660 undergoes no notable autolysis in culture solution. Thus,
greater yields may be expected by using more efficient expression systems. We therefore attempted to produce
Kex2-660 in an expression system using the methanol-utilizing yeast Candida boidinii as the host.

1) Construction of expression plasmid using Candida boidinii (Figs. 11 and 22)

The NKEX2-660 gene was constructed by PCR in the same manner as Example 1, 1), except that NKEX2
(SEQ ID NO.41) and KM088 were used as the primers. NKEX2 contains the nucleotide sequence of the restriction
enzyme NotI site (underlined) at the 5'-end, and includes the sequence of bases -107 to -132 upstream from the initial
methionine of the KEX2 gene (Fig. 11). The NKEX2-660 gene (the KEX2 gene with 132 base pairs of the 5' untranslated
region of the KEX2 gene) was cloned in pCRII and then excised using restriction enzyme NotI. The NotI DNA fragment
containing the NKEX2-660 gene was inserted at the NotI site of plasmid pNOTell under promoter control to allow
expression of the KEX2-660 gene, to thus construct pCU660 (Fig. 22).

2) Production of secretory Kex2 derivative with Candida boidinii (Fig. 23) 

Plasmid pCU660, linearized by digestion with restriction enzyme BamHI within the URA3 gene,
was introduced into TK62, and the transformed strains TK62[pCU660] were selected. Twenty TK62[pCU660] strains
(#1 to #20) were then cultured at 27°C with shaking in BMGY medium. After 2 days, approximately 10 OD·ml of the
preculture solution was transferred into 1 ml of BMMY medium and further cultured at 27°C with shaking. After 30
hours, the Kex2 activity of the culture supernatant was measured. The 5 strains with the highest activity were cultured
in the same manner, and TK62[pCU660]#10, which had reproducibly high Kex2 activity, was selected and cultured in
a fermenter.

After transferring 1 ml of glycerol-frozen stock TK62[pCU660]#10 into a 300 ml Erlenmeyer flask containing 25 ml
of YPD medium, it was precultured at 27°C for 16 hours. A 10.5 ml portion of the preculture solution (OD600 = 38)
was transferred into 2 L of YPGM culturing medium, and a 5 L fermenter (Model KMJ-5B-4U-FP, of Mitsuwa Rika) was
used for culturing at 27°C. The aeration volume was 4 L/min and the stirring speed was adjusted so that the amount
of dissolved oxygen remained over 2.5 ppm. The methanol, glycerol and nitrogen source (5% (w/v) yeast extract, 10%
(w/v) peptone, 6.7% (v/v) YNB w/o AA; 1/25 volume/addition) were supplemented as appropriate.

The pH was controlled to remain above pH 5.5 by adding 7.5% (v/v) ammonia water. An antifoaming agent (Disfoam
CC-222, Nihon Yushi, KK.) was added at 0.5 ml/L at the start of culturing and was added thereafter as necessary. The
results of SDS-PAGE for the culture supernatant at each culturing time are shown in Fig. 23. The OD600 after 48 hours
of culturing was 353, and this culturing produced about 2800 MU of Kex2-660 (corresponding to about 340 mg) per 1
L of culture supernatant.

This yield is sufficient for excision of about 200 g of hPTH(1-34) under the conditions of Example 7, and thus it was
shown that the Kex2 derivative of the invention can be practically used for excision of desired peptides from chimeric
proteins on an industrial scale.
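
The figure of about 200 g can be reproduced from the numbers given in Example 7. The following Python sketch is illustrative only and assumes that the amount of hPTH(1-34) excised scales linearly with the amount of Kex2-660 under the conditions of Example 7; the variable names are hypothetical.

```python
# Values stated in the text.
YIELD_MU_PER_L = 2800      # Kex2-660 produced per 1 L of culture supernatant
YIELD_MG_PER_L = 340       # corresponding mass of Kex2-660
EX7_VOLUME_ML = 5000       # reaction volume in Example 7
EX7_KEX2_KU_PER_ML = 20    # Kex2-660 concentration in Example 7
EX7_PRODUCT_G = 7          # hPTH(1-34) excised in Example 7

# Kex2-660 consumed in Example 7, converted from kU to MU.
kex2_used_mu = EX7_KEX2_KU_PER_ML * EX7_VOLUME_ML / 1000

# Number of Example 7 scale reactions one litre of supernatant could supply.
reactions_per_litre = YIELD_MU_PER_L / kex2_used_mu

# Assumes linear scaling of product with enzyme amount.
product_g = reactions_per_litre * EX7_PRODUCT_G

# Specific activity implied by the stated yield.
specific_activity_mu_per_mg = YIELD_MU_PER_L / YIELD_MG_PER_L

print(f"Kex2-660 used in Example 7: {kex2_used_mu:.0f} MU")
print(f"hPTH(1-34) per litre of supernatant: {product_g:.0f} g")
print(f"Specific activity: {specific_activity_mu_per_mg:.1f} MU/mg")
```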


Example 9. Expression of secretory Kex2 derivative in Saccharomyces cerevisiae (2) 

In Example 1 it was demonstrated that the yields of Kex2 proteases lacking the C-terminal region (Kex2-660 and
Kex2-679) were notably higher than those of Kex2-614, Kex2-699 and Kex2-688. In this example, additional Kex2
proteases lacking the C-terminal region (Kex2-630, Kex2-640, Kex2-650 and Kex2-682) were constructed, to further
investigate the relationship between the C-terminal region and Kex2 protease yields.

1) Construction of secretory Kex2 derivative-expressing plasmids

The secretory Kex2 genes were constructed by the method in Example 1, 1). Specifically, the primers
KM100 (SEQ ID NO.42), KM102 (SEQ ID NO.43), KM103 (SEQ ID NO.44) and KM104 (SEQ ID NO.45) have nucleotide
sequences which are antisense strands to sequences in which the translation termination codon TAA is added directly
after the 630th, 640th, 650th and 682nd amino acids from the N-terminus, respectively.

Construction of EcoRI-SalI DNA fragments coding for the secretory Kex2 derivative genes and the expression
vectors containing them was accomplished by the method in Example 1, 1). The polypeptides from the N-terminus of
Kex2 protease to the 630th, 640th, 650th and 682nd amino acids were named Kex2-630, Kex2-640, Kex2-650 and
Kex2-682, and the genes coding for them were named KEX2-630, KEX2-640, KEX2-650 and KEX2-682, respectively.
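
The design principle of such truncation primers (an antisense strand to a sequence in which TAA directly follows the last retained codon) can be sketched as follows. This Python example is illustrative only and does not reproduce the actual primers of SEQ ID NOs 42 to 45. The number of annealing codons is a hypothetical choice, the appended SalI recognition sequence (GTCGAC) is assumed by analogy with the EcoRI-SalI fragments described above, and the short open reading frame in the usage example is simply the first ten codons of SEQ ID NO:1.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
SALI_SITE = "GTCGAC"  # SalI recognition sequence (assumed 3' cloning site)


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def truncation_primer(orf, last_residue, anneal_codons=7, site=SALI_SITE):
    """Antisense primer that places TAA directly after codon `last_residue`.

    orf: coding sequence starting at the ATG codon
    last_residue: number of the last amino acid to retain (1-based)
    anneal_codons: codons upstream of the stop codon used for annealing
    """
    end = 3 * last_residue
    if end > len(orf) or anneal_codons > last_residue:
        raise ValueError("truncation point or annealing region out of range")
    sense = orf[end - 3 * anneal_codons:end] + "TAA" + site
    return reverse_complement(sense)


# Toy example: first ten codons of the KEX2 coding region, truncated after
# residue 8 with a 4-codon annealing region.
toy_orf = "ATGAAAGTGAGGAAATATATTACTTTATGC"
print(truncation_primer(toy_orf, last_residue=8, anneal_codons=4))
```

Read 5' to 3', the primer begins with the restriction site, followed by TTA (the complement of the TAA stop codon) and then the antisense of the last retained codons.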

2) Transformation and production of secretory Kex2 derivatives (see Figs. 25 and 26) 

Plasmids (pYE-630, pYE-640, pYE-650 and pYE-682) were introduced into strain K16-57C to obtain strains
K16-57C[pYE-630], K16-57C[pYE-640], K16-57C[pYE-650] and K16-57C[pYE-682]. These transformants were then
cultured, together with the secretory Kex2 derivative-producing strains prepared in Example 1, 2), and the Kex2 yields (amount
of Kex2 secreted into the culture medium) were determined by measurements of Kex2 activity in the culture supernatants
and SDS-PAGE of the culture supernatant concentrates.

The colonies were inoculated into YCDP medium and then cultured at 30°C with shaking to prepare cells in the
logarithmic growth phase. These cells were subcultured in YCDP medium to an OD660 absorbance of
approximately 0.02, and then further cultured for about 16 hours at 30°C. The results of Kex2 activity measurement are
shown in Fig. 25, and the results of SDS-PAGE are shown in Fig. 26.

The Kex2 activities per OD660 in culture supernatant for K16-57C[pYE-630], K16-57C[pYE-640],
K16-57C[pYE-650], K16-57C[pYE-660] and K16-57C[pYE-679] were roughly 12 times that of K16-57C[pYE-614], thus showing no
difference in Kex2 yields for the range of KEX2-630 to KEX2-679.

In addition, the Kex2 yields for K16-57C[pYE-682] and K16-57C[pYE-688] were, respectively, 6 times and 3.4
times that of K16-57C[pYE-614], i.e. the yield fell as the C-terminal region became longer (Fig. 25). No Kex2 activity was
detected in the culture supernatants of K16-57C[pYE-22m] and K16-57C[pYE-699].

Also, the results of SDS-PAGE demonstrated that the yields for Kex2-630, Kex2-640, Kex2-650, Kex2-660,
Kex2-679, Kex2-682 and Kex2-688 were also greater than for Kex2-614, in agreement with the Kex2 activities (Fig. 26).

Example 10. Expression of secretory Kex2 derivative in Pichia pastoris 

The results of Example 8 showed that the secretory Kex2 derivative Kex2-660 can be produced in large amounts
in an expression system using Candida boidinii as the host. The possibility of producing Kex2-660 with other
methylotrophic yeast was investigated using an expression system with Pichia pastoris as the host (Pichia Expression Kit,
Invitrogen Co.).

1) Construction of secretory Kex2 derivative-expressing plasmid in Pichia pastoris host (see Figs. 11 and 27) 


A PCR reaction was conducted using plasmid pYE-660, digested with restriction enzyme EcoRI and in linear form,
as the template and KM085 (SEQ ID NO.36) and KM088 (SEQ ID NO.37) as primers. The purified reaction product
was cloned in pCRII (Invitrogen Co.). The resulting plasmid was digested with restriction enzyme EcoRI, to obtain a
DNA fragment consisting of the KEX2-660 gene with the restriction enzyme EcoRI site at both ends.
The DNA fragment containing the KEX2-660 gene was inserted at the restriction enzyme EcoRI site of plasmid
pHIL-D2 (Pichia Expression Kit), to obtain plasmid pHIL-660 having the KEX2-660 gene inserted in an orientation
allowing its expression under the AOX promoter (Fig. 27).

2) Production of secretory Kex2 derivative in Pichia pastoris 


A fragment containing the KEX2-660 gene obtained by digesting plasmid pHIL-660 with restriction enzyme NotI
was introduced into Pichia pastoris GS115 (his-, AOX+, Pichia Expression Kit), and the GS115[pHIL-660] transformants
which grew in medium containing no histidine were selected. From these were obtained 5 GS115[pHIL-660](AOX-)
strains which could not grow with methanol alone as the carbon source. These were then cultured in BMMY medium,
to obtain GS115[pHIL-660] #23 which had the greatest yield of Kex2 production.

The Kex2 yield of GS115[pHIL-660] #23 was then examined. First, a colony was seeded into 10 ml of BMGY
medium and cultured at 30°C for 2 days with shaking. Cells obtained by centrifugation of 10 ml of the culture medium
were suspended in 2 ml of BMMY medium and further cultured at 25°C for 2 days with shaking, after which the Kex2
activity of the culture supernatant was measured. As a result, the production of Kex2-660 was found to be about 1350
kU (corresponding to about 160 µg) per 1 ml of culture medium.

Thus it was demonstrated that Kex2-660 can be produced in large amounts even in an expression system in which
the host is Pichia pastoris, a methylotrophic yeast other than Candida boidinii.

Fig. 27 shows the method of constructing the Kex2-660-expressing plasmid pHIL-660 with Pichia pastoris as the
host.

SEQUENCE LISTING

SEQ ID NO:1
SEQUENCE LENGTH: 2848
SEQUENCE TYPE: Nucleic acid
STRANDEDNESS: Double strand
TOPOLOGY: Linear
MOLECULE TYPE: cDNA
SOURCE:
ORGANISM: Saccharomyces cerevisiae
STRAIN: X2180-1B

SEQUENCE 

TGCATAATTC TGTCATAAGC CTGTTCTTTT TCCTGGCTTA AACATCCCGT TTTGTAAAAG  60
AGAAATCTAT TCCACATATT TCATTCATTC GGCTACCATA CTAAGGATAA ACTAATCCCG  120
TTGTTTTTTG GCCTCGTCAC ATAATTATAA ACTACTAACC CATTATCAG ATG AAA GTG  178
                                                      Met Lys Val
                                                      1

AGG AAA TAT ATT ACT TTA TGC TTT TGG TGG GCC TTT TCA ACA TCC GCT  226
Arg Lys Tyr Ile Thr Leu Cys Phe Trp Trp Ala Phe Ser Thr Ser Ala
    5                   10                  15

CTT GTA TCA TCA CAA CAA ATT CCA TTG AAG GAC CAT ACG TCA CGA CAG  274
Leu Val Ser Ser Gln Gln Ile Pro Leu Lys Asp His Thr Ser Arg Gln
20                  25                  30                  35

TAT TTT GCT GTA GAA AGC AAT GAA ACA TTA TCC GGC TTG GAG GAA ATG  322
Tyr Phe Ala Val Glu Ser Asn Glu Thr Leu Ser Arg Leu Glu Glu Met
                40                  45                  50

CAT CCA AAT TGG AAA TAT GAA CAT GAT GTT CGA GGG CTA CCA AAC CAT  370
His Pro Asn Trp Lys Tyr Glu His Asp Val Arg Gly Leu Pro Asn His
            55                  60                  65

TAT GTT TTT TCA AAA GAG TTG CTA AAA TTG GGC AAA AGA TCA TCA TTA  418
Tyr Val Phe Ser Lys Glu Leu Leu Lys Leu Gly Lys Arg Ser Ser Leu
        70                  75                  80

GAA GAG TTA CAG GGG GAT AAC AAC GAC CAC ATA TTA TCT GTC CAT GAT  466
Glu Glu Leu Gln Gly Asp Asn Asn Asp His Ile Leu Ser Val His Asp
    85                  90                  95

TTA TTC CCG CGT AAC GAC CTA TTT AAG AGA CTA CCG GTG CCT GCT CCA  514
Leu Phe Pro Arg Asn Asp Leu Phe Lys Arg Leu Pro Val Pro Ala Pro
100                 105                 110                 115

CCA ATG GAC TCA AGC TTG TTA CCG GTA AAA GAA GCT GAG GAT AAA CTC  562
Pro Met Asp Ser Ser Leu Leu Pro Val Lys Glu Ala Glu Asp Lys Leu
                120                 125                 130

AGC ATA AAT GAT CCG CTT TTT GAG AGG CAG TGG CAC TTG GTC AAT CCA  610
Ser Ile Asn Asp Pro Leu Phe Glu Arg Gln Trp His Leu Val Asn Pro
            135                 140                 145

AGT TTT CCT GGC AGT GAT ATA AAT GTT CTT GAT CTG TGG TAC AAT AAT  658
Ser Phe Pro Gly Ser Asp Ile Asn Val Leu Asp Leu Trp Tyr Asn Asn
        150                 155                 160

ATT ACA GGC GCA GGG GTC GTG GCT GCC ATT GTT GAT GAT GGC CTT GAC  706
Ile Thr Gly Ala Gly Val Val Ala Ala Ile Val Asp Asp Gly Leu Asp
    165                 170                 175

TAC GAA AAT GAA GAC TTG AAG GAT AAT TTT TGC GCT GAA GGT TCT TGG  754
Tyr Glu Asn Glu Asp Leu Lys Asp Asn Phe Cys Ala Glu Gly Ser Trp
180                 185                 190                 195

GAT TTC AAC GAC AAT ACC AAT TTA CCT AAA CCA AGA TTA TCT GAT GAC  802
Asp Phe Asn Asp Asn Thr Asn Leu Pro Lys Pro Arg Leu Ser Asp Asp
                200                 205                 210

TAC CAT GGT ACG AGA TGT GCA GGT GAA ATA GCT GCC AAA AAA GGT AAC  850
Tyr His Gly Thr Arg Cys Ala Gly Glu Ile Ala Ala Lys Lys Gly Asn
            215                 220                 225

AAT TTT TGC GGT GTC GGG GTA GGT TAC AAC GCT AAA ATC TCA GGC ATA  898
Asn Phe Cys Gly Val Gly Val Gly Tyr Asn Ala Lys Ile Ser Gly Ile
        230                 235                 240

AGA ATC TTA TCC GGT GAT ATC ACT ACG GAA GAT GAA GCT GCG TCC TTG  946
Arg Ile Leu Ser Gly Asp Ile Thr Thr Glu Asp Glu Ala Ala Ser Leu
    245                 250                 255

ATT TAT GGT CTA GAC GTA AAC GAT ATA TAT TCA TGC TCA TGG GGT CCC  994
Ile Tyr Gly Leu Asp Val Asn Asp Ile Tyr Ser Cys Ser Trp Gly Pro
260                 265                 270                 275

GCT GAT GAC GGA AGA CAT TTA CAA GGC CCT AGT GAC CTG GTG AAA AAG  1042
Ala Asp Asp Gly Arg His Leu Gln Gly Pro Ser Asp Leu Val Lys Lys
                280                 285                 290

GCT TTA GTA AAA GGT GTT ACT GAG GGA AGA GAT TCC AAA GGA GCG ATT 1090 

Ala Leu Val Lys Gly Val Thr Glu Gly Arg Asp Ser Lys Gly Ala He 
55 295 300 305 
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TAC GTT TTT GCC ACT GGA AAT GGT GGA ACT CGT GGT GAT AAT TGC AAT 1138 
Tyr Val Phe Ala Ser Gly Asn Gly Gly Thr Arg Gly Asp Asn Cys Asn 

310 315 320 

TAC GAC GGC TAT ACT AAT TCC ATA TAT TCT ATT ACT ATT GGG GCT ATT 1186 
Tyr Asp Gly Tyr Thr Asn Ser lie Tyr Ser lie Thr lie Gly Ala lie 

325 330 335 

GAT CAC AAA GAT CTA CAT CCT CCT TAT TCC GAA GGT TGT TCC GCC GTC 1234 
Asp His Lys Asp Leu His Pro Pro Tyr Ser Glu Gly Cys Ser Ala Val 
340 345 350 355 

ATG GCA GTC ACG TAT TCT TCA GGT TCA GGC GAA TAT ATT CAT TCG AGT 1282 
Met Ala Val Thr Tyr Ser Ser Gly Ser Gly Glu Tyr lie His Ser Ser 
360 365 370 

20 GAT ATC AAC GGC AGA TGC AGT AAT AGC CAC GGT GGA ACG TCT GCG GCT 1330 

Asp lie Asn Gly Arg Cys Ser Asn Ser His Gly Gly Thr Ser Ala Ala 

375 380 385 

GCT CCA TTA GCT GCC GGT GTT TAC ACT TTG TTA CTA GAA GCC AAC CCA 1378 
Ala Pro Leu Ala Ala Gly Val Tyr Thr Leu Leu Leu Glu Ala Asn Pro 

390 395 400 

AAC CTA ACT TGG AGA GAC GTA CAG TAT TTA TCA ATC TTG TCT GCG GTA 1426 
Asn Leu Thr Trp Arg Asp Val Gin Tyr Leu Ser lie Leu Ser Ala Val 

405 410 415 

GGG TTA GAA AAG AAC GCT GAC GGA GAT TGG AGA GAT AGC GCC ATG GGG 14 74 
35 Gly Leu Glu Lys Asn Ala Asp Gly Asp Trp Arg Asp Ser Ala Met Gly 

420 425 430 435 

AAG AAA TAC TCT CAT CGC TAT GGC TTT GGT AAA ATC GAT GCC CAT AAG 1522 
40 Lys Lys Tyr Ser His Arg Tyr Gly Phe Gly Lys lie Asp Ala His Lys 

440 445 450 

TTA ATT GAA ATG TCC AAG ACC TGG GAG AAT GTT AAC GCA CAA ACC TGG 1570 
Leu He Glu Met Ser Lys Thr Trp Glu Asn Val Asn Ala Gin Thr Trp 

455 460 465 

TTT TAC CTG CCA ACA TTG TAT GTT TCC CAG TCC ACA AAC TCC ACG GAA 1618 
Phe Tyr Leu Pro Thr Leu Tyr Val Ser Gln Ser Thr Asn Ser Thr Glu
470 475 480

GAG ACA TTA GAA TCC GTC ATA ACC ATA TCA GAA AAA AGT CTT CAA GAT 1666 
Glu Thr Leu Glu Ser Val Ile Thr Ile Ser Glu Lys Ser Leu Gln Asp
485 490 495

GCT AAC TTC AAG AGA ATT GAG CAC GTC ACG GTA ACT GTA GAT ATT GAT 1714
Ala Asn Phe Lys Arg Ile Glu His Val Thr Val Thr Val Asp Ile Asp
500 505 510 515 

ACA GAA ATT AGG GGA ACT ACG ACT GTC GAT TTA ATA TCA CCA GCG GGG 1762 
Thr Glu Ile Arg Gly Thr Thr Thr Val Asp Leu Ile Ser Pro Ala Gly

520 525 530 

ATA ATT TCA AAC CTT GGC GTT GTA AGA CCA AGA GAT GTT TCA TCA GAG 1810 
Ile Ile Ser Asn Leu Gly Val Val Arg Pro Arg Asp Val Ser Ser Glu

535 540 545 

GGA TTC AAA GAC TGG ACA TTC ATG TCT GTA GCA CAT TGG GGT GAG AAC 1858 
Gly Phe Lys Asp Trp Thr Phe Met Ser Val Ala His Trp Gly Glu Asn 

550 555 560 

GGC GTA GGT GAT TGG AAA ATC AAG GTT AAG ACA ACA GAA AAT GGA CAC 1906 
Gly Val Gly Asp Trp Lys Ile Lys Val Lys Thr Thr Glu Asn Gly His

565 570 575 

AGG ATT GAC TTC CAC AGT TGG AGG CTG AAG CTC TTT GGG GAA TCC ATT 1954 
Arg Ile Asp Phe His Ser Trp Arg Leu Lys Leu Phe Gly Glu Ser Ile
580 585 590 595 

GAT TCA TCT AAA ACA GAA ACT TTC GTC TTT GGA AAC GAT AAA GAG GAG 2002 
Asp Ser Ser Lys Thr Glu Thr Phe Val Phe Gly Asn Asp Lys Glu Glu 

600 605 610 

GTT GAA CCA GCT GCT ACA GAA AGT ACC GTA TCA CAA TAT TCT GCC AGT 2050 
Val Glu Pro Ala Ala Thr Glu Ser Thr Val Ser Gln Tyr Ser Ala Ser

615 620 625 

TCA ACT TCT ATT TCC ATC AGC GCT ACT TCT ACA TCT TCT ATC TCA ATT 2098 
Ser Thr Ser Ile Ser Ile Ser Ala Thr Ser Thr Ser Ser Ile Ser Ile

630 635 640 

GGT GTG GAA ACG TCG GCC ATT CCC CAA ACG ACT ACT GCG AGT ACC GAT 2146 
Gly Val Glu Thr Ser Ala Ile Pro Gln Thr Thr Thr Ala Ser Thr Asp

645 650 655 

CCT GAT TCT GAT CCA AAC ACT CCT AAA AAA CTT TCC TCT CCT AGG CAA 2194 
Pro Asp Ser Asp Pro Asn Thr Pro Lys Lys Leu Ser Ser Pro Arg Gln
660 665 670 675 

GCC ATG CAT TAT TTT TTA ACA ATA TTT TTG ATT GGC GCC ACA TTT TTG 2242 
Ala Met His Tyr Phe Leu Thr Ile Phe Leu Ile Gly Ala Thr Phe Leu
680 685 690

GTG TTA TAC TTC ATG TTT TTT ATG AAA TCA AGG AGA AGG ATC AGA AGG 2290
Val Leu Tyr Phe Met Phe Phe Met Lys Ser Arg Arg Arg Ile Arg Arg

695 700 705 

TCA AGA GCG GAA ACG TAT GAA TTC GAT ATC ATT GAT ACA GAC TCT GAG 2338 
Ser Arg Ala Glu Thr Tyr Glu Phe Asp Ile Ile Asp Thr Asp Ser Glu

710 715 720 

TAC GAT TCT ACT TTG GAC AAT GGA ACT TCC GGA ATT ACT GAG CCC GAA 2386 
Tyr Asp Ser Thr Leu Asp Asn Gly Thr Ser Gly Ile Thr Glu Pro Glu
725 730 735 

GAG GTT GAG GAC TTC GAT TTT GAT TTG TCC GAT GAA GAC CAT CTT GCA 2434

Glu Val Glu Asp Phe Asp Phe Asp Leu Ser Asp Glu Asp His Leu Ala 
740 745 750 755 

AGT TTG TCT TCA TCA GAA AAC GGT GAT GCT GAA CAT ACA ATT GAT AGT 2482

Ser Leu Ser Ser Ser Glu Asn Gly Asp Ala Glu His Thr Ile Asp Ser

760 765 770 

GTA CTA ACA AAC GAA AAT CCA TTT AGT GAC CCT ATA AAG CAA AAG TTC 2530 
Val Leu Thr Asn Glu Asn Pro Phe Ser Asp Pro Ile Lys Gln Lys Phe

775 780 785 

CCA AAT GAC GCC AAC GCA GAA TCT GCT TCC AAT AAA TTA CAA GAA TTA 2578 
Pro Asn Asp Ala Asn Ala Glu Ser Ala Ser Asn Lys Leu Gln Glu Leu

790 795 800 

CAG CCT GAT GTT CCT CCA TCT TCC GGA CGA TCG 2611 
Gln Pro Asp Val Pro Pro Ser Ser Gly Arg Ser

805 810 814 

TGATTCGATA TGTACAGAAA GCTTCAAATT ACAAAATAGC ATTTTTTTCT TATAGATTAT 2671 
AATACTCTCT CATACGTATA CGTATATGTG TATATGATAT ATAAACAAAC ATTAATATCC 2731
TATTCCTTCC GTTTGAAATC CCTATGATGT ACTTTGCATT GTTTGCACCC GCGAATAAAA 2791 
TGAAAACTCC GAACCGATAT ATCAAGCACA TAAAAGGGGA GGGTCCAATT AATGCAT 2848 
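The paired codon and residue rows of SEQ ID NO: 1 can be cross-checked mechanically, which helps when a scanned copy is in doubt. The following is a minimal sketch and not part of the listing itself: it assumes the standard genetic code, and the names `translate_row`, `CODON_TABLE` and `THREE_LETTER` are hypothetical helpers introduced only for illustration.

```python
# Hypothetical helper (not from the patent): translate one codon row of the
# listing into three-letter residue names using the standard genetic code.
BASES = "TCAG"
# One-letter residues in the conventional TCAG x TCAG x TCAG codon order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}

CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_row(row):
    """Translate space-separated codons; non-codon tokens (e.g. the
    trailing base count) are ignored."""
    codons = [t for t in row.split() if len(t) == 3 and set(t) <= set(BASES)]
    return " ".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


# Last coding row of SEQ ID NO: 1 (residues 804 to 814).
print(translate_row("CAG CCT GAT GTT CCT CCA TCT TCC GGA CGA TCG 2611"))
```

Comparing the output with the printed residue row flags rows where a base or a residue name may have been misread.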
SEQ ID NO: 2 
SEQUENCE LENGTH: 139 
SEQUENCE TYPE: Amino acid 
TOPOLOGY: Linear 
MOLECULE TYPE: Peptide
SEQUENCE 

Thr Met Ile Thr Asp Ser Leu Ala Val Val Leu Gln Arg Arg Asp Trp
1 5 10 15

Glu Asn Pro Gly Val Thr Gln Leu Asn Arg Leu Ala Ala His Pro Pro

20 25 30 

Phe Ala Ser Trp Arg Asn Ser Glu Glu Ala Arg Thr Asp Arg Pro Ser 

35 40 45 

Gln Gln Leu Arg Ser Leu Asn Gly Glu Trp Arg Phe Ala Trp Phe Pro

50 55 60 

Ala Pro Glu Ala Val Pro Glu Ser Leu Leu Glu Ser Asp Leu Pro Glu 
65 70 75 80 

Ala Asp Thr Val Val Val Pro Ser Asn Trp Gln Met His Gly Tyr Asp
85 90 95

Ala Pro Ile Tyr Thr Asn Val Thr Tyr Pro Ile Thr Val Asn Pro Pro
100 105 110 

Phe Val Pro Thr Glu Asn Pro Thr Gly Ser Tyr Ser Leu Thr Phe Asn

115 120 125 

Val Asp Glu Ser Trp Leu Gln Glu Gly Gln Thr

130 135 
SEQ ID NO: 3 
SEQUENCE LENGTH: 84 
SEQUENCE TYPE: Amino acid 
TOPOLOGY: Linear
MOLECULE TYPE: Peptide
SEQUENCE 

Ser Val Ser Glu Ile Gln Leu Met His Asn Leu Gly Lys His Leu Asn

1 5 10 15 

Ser Met Glu Arg Val Glu Trp Leu Arg Lys Lys Leu Gln Asp Val His

20 25 30 

Asn Phe Val Ala Leu Gly Ala Pro Leu Ala Pro Arg Asp Ala Gly Ser 

35 40 45 

Gln Arg Pro Arg Lys Lys Glu Asp Asn Val Leu Val Glu Ser His Glu
50 55 60

Lys Ser Leu Gly Glu Ala Asp Lys Ala Asp Val Asn Val Leu Thr Lys 
65 70 75 80 

Ala Lys Ser Gln
84 

SEQ ID NO:4 
SEQUENCE LENGTH: 14 
SEQUENCE TYPE: Amino acid 
TOPOLOGY: Linear
MOLECULE TYPE: Peptide
SEQUENCE 

Gly Gly Ser Ser Arg Val Ile Leu Gln Ala Cys Leu Ile Asn
1 5 10 14

SEQ ID NO: 5 
SEQUENCE LENGTH: 40 
SEQUENCE TYPE: Nucleic acid 
STRANDEDNESS: Single strand 
TOPOLOGY: Linear

MOLECULE TYPE: Synthetic DNA 
SEQUENCE 

AATTCATGAA ATCTGTTAAA AAGCGTTCTG TTTCTGAAAT 40 

SEQ ID NO: 6 
SEQUENCE LENGTH: 41 
SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE

TCAGCTGATG CATAACCTGG GCAAACACCT GAATAGCATG G 41 

SEQ ID NO: 7 
SEQUENCE LENGTH: 41

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

AACGCGTCGA GTGGCTGCGT AAGAAACTGC AGGACGTCCA C 41 

SEQ ID NO: 8

SEQUENCE LENGTH: 41 
SEQUENCE TYPE: Nucleic acid 
STRANDEDNESS: Single strand

TOPOLOGY: Linear 
MOLECULE TYPE: Synthetic DNA 
SEQUENCE 

AACTTCGTTG CGCTGGGTGC ACCGCTGGCT CCACGTGATG C 41 

SEQ ID NO: 9

SEQUENCE LENGTH: 39 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

AGGATCCCAA CGTCCGCGTA AGAAAGAAGA TAACGTACT 

SEQ ID NO: 10 

SEQUENCE LENGTH: 40 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

GGTTGAATCT CATGAGAAAT CCCTGGGCGA AGCTGACAAA 

SEQ ID NO: 11 

SEQUENCE LENGTH: 40 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

GCCGATGTTA ACGTGCTGAC CAAAGCGAAA AGCCAGTAAG 

SEQ ID NO: 12 

SEQUENCE LENGTH: 33 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

TCGACTTACT GGCTTTTCGC TTTGGTCAGC ACG 
SEQ ID NO: 13 
SEQUENCE LENGTH: 40 
SEQUENCE TYPE: Nucleic acid 
STRANDEDNESS: Single strand 
TOPOLOGY: Linear 
MOLECULE TYPE: Synthetic DNA
SEQUENCE 

TTAACATCGG CTTTGTCAGC TTCGCCCAGG GATTTCTCAT 

SEQ ID NO: 14 

SEQUENCE LENGTH: 40 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

GAGATTCAAC CAGTACGTTA TCTTCTTTCT TACGCGGACG 

SEQ ID NO: 15 

SEQUENCE LENGTH: 41 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

TTGGGATCCT GCATCACGTG GAGCCAGCGG TGCACCCAGC G 

SEQ ID NO: 16 

SEQUENCE LENGTH: 41 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

CAACGAAGTT GTGGACGTCC TGCAGTTTCT TACGCAGCCA C 

SEQ ID NO: 17 

SEQUENCE LENGTH: 41 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

TCGACGCGTT CCATGCTATT CAGGTGTTTG CCCAGGTTAT G 
SEQ ID NO: 18 
SEQUENCE LENGTH: 46
SEQUENCE TYPE: Nucleic acid

STRANDEDNESS: Single strand

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

CATCAGCTGA ATTTCAGAAA CAGAACGCTT TTTAACAGAT TTCATG 

SEQ ID NO: 19 

SEQUENCE LENGTH: 24 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

TCGAGGTCGA CGGTACCGAG CTCG 

SEQ ID NO: 20 

SEQUENCE LENGTH: 24 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

AATTCGTGCT CGGTACCGTC GACC 

SEQ ID NO: 21 

SEQUENCE LENGTH: 24 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

AATTCGAGCT CGGTACCGTC GACC 

SEQ ID NO: 22 

SEQUENCE LENGTH: 24 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

TCGAGGTCGA CGGTACCGAG CTCG

SEQ ID NO: 23 

SEQUENCE LENGTH: 36 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

TCGAGAAAGA AGAAGGCGTA AGCTTGGAAA AACGAT 

SEQ ID NO: 24 

SEQUENCE LENGTH: 30 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

CGTTTTTCCA AGCTTACGCC TTCTTCTTTC 
SEQ ID NO: 25 
SEQUENCE LENGTH: 33 
SEQUENCE TYPE: Nucleic acid 
STRANDEDNESS: Single strand 
TOPOLOGY: Linear 
MOLECULE TYPE: Synthetic DNA 
SEQUENCE 

GTTAAAAAGC GATCGGTTTC TGAAATTCAG CTG 

SEQ ID NO: 26 

SEQUENCE LENGTH: 34 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

GCACCGGTAC CTTAGAAGTT GTGGACGTCC TGCA 
SEQ ID NO: 27 
SEQUENCE LENGTH: 30 
SEQUENCE TYPE: Nucleic acid 
STRANDEDNESS: Single strand 
TOPOLOGY: Linear

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

GCTAAGGAAG AATTCATGGA GAAAAAAATC 
SEQ ID NO: 28 
SEQUENCE LENGTH: 24 
SEQUENCE TYPE: Nucleic acid 
STRANDEDNESS: Single strand
TOPOLOGY: Linear 
MOLECULE TYPE: Synthetic DNA 
SEQUENCE 

CTGCCTTAAA ACTCGAGCGC CCCG 

SEQ ID NO:29 

SEQUENCE LENGTH: 45 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

AAACTGCAGG ACGTCCACAA CTTCTAAGCG CTGGGTGCAC CGCGT 

SEQ ID NO: 30 

SEQUENCE LENGTH: 24 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

CATTAAAGCT TTGCGATGAT AAGC 

SEQ ID NO:31 

SEQUENCE LENGTH: 26

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

CGCACCGATC GCCCTTCCCA ACAGTT 
SEQ ID NO: 32 
SEQUENCE LENGTH: 35

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

TTTCCCGGGC CTCCGTGGGA ACAAACGGCG GATTG 

SEQ ID NO: 33 

SEQUENCE LENGTH: 42

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

TTTCCCGGGA GGCCTTCTGT TAAAAAGCGG TCTGTTTCTG AA 

SEQ ID NO: 34 

SEQUENCE LENGTH: 18 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

CACCATCATC ACCCTGGA 

SEQ ID NO: 35 

SEQUENCE LENGTH: 18 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

TCCAGGGTGA TGATGGTG 
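
Several of the synthetic oligonucleotides in this listing appear to be complementary pairs; SEQ ID NO: 34 and SEQ ID NO: 35, for example, read as exact reverse complements of each other. The following minimal sketch shows how such a pairing can be checked. It is an illustration only: the function name `reverse_complement` is a hypothetical helper, and the check assumes plain A/C/G/T sequences written 5' to 3'.

```python
# Hypothetical helper (not from the patent): reverse complement of a DNA
# string, ignoring the spaces used to group bases in tens.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


seq_34 = "CACCATCATC ACCCTGGA"   # SEQ ID NO: 34 as printed
seq_35 = "TCCAGGGTGA TGATGGTG"   # SEQ ID NO: 35 as printed

# True when the two oligonucleotides anneal over their full length.
print(reverse_complement(seq_34) == seq_35.replace(" ", ""))
```

The same comparison can be applied to any other pair of oligonucleotides suspected of forming a duplex, including pairs that overlap only partially once the overlapping region is extracted.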

SEQ ID NO: 36 

SEQUENCE LENGTH: 37 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE

TAAGAATTCA TGAAAGTGAG GAAATATATT ACTTTAT 

SEQ ID NO: 37 

SEQUENCE LENGTH: 35 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

TAAGTCGACT TAAGGATCGG TACTCGCAGT AGTCG 

SEQ ID NO: 38 

SEQUENCE LENGTH: 35 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

TAAGTCGACT TAATAATGCA TGGCTTGCCT AGGAG 

SEQ ID NO: 39 

SEQUENCE LENGTH: 39 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

TAAGTCGACT TAGGCGCCAA TCAAAAATAT TGTTAAAAA 

SEQ ID NO:40 

SEQUENCE LENGTH: 39 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

TAAGTCGACT TACATAAAAA ACATGAAGTA TAACACCAA 

SEQ ID NO: 41 

SEQUENCE LENGTH: 35 

SEQUENCE TYPE: Nucleic acid

STRANDEDNESS: Single strand

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

GCGGCCGCTT AAACATCCCG TTTTGTAAAA AGAGA 

SEQ ID NO: 42 

SEQUENCE LENGTH: 31 

SEQUENCE TYPE: Nucleic acid 

STRANDEDNESS: Single strand 

TOPOLOGY: Linear 

MOLECULE TYPE: Synthetic DNA 

SEQUENCE 

GGGGTCGACT TAAGAAGTTG AACTGGCAGA A 
SEQ ID NO: 43 
SEQUENCE LENGTH: 31 
SEQUENCE TYPE: Nucleic acid 
STRANDEDNESS: Single strand 
TOPOLOGY: Linear 
MOLECULE TYPE: Synthetic DNA 
SEQUENCE 

GGGGTCGACT TAAGAAGATG TAGAAGTAGC G 
SEQ ID NO:44 
SEQUENCE LENGTH: 30 
SEQUENCE TYPE: Nucleic acid 
STRANDEDNESS: Single strand 
TOPOLOGY: Linear 
MOLECULE TYPE: Synthetic DNA 
SEQUENCE 

GGGGTCGACT TAAATGGCCG ACGTTTCCAC 
SEQ ID NO: 45 
SEQUENCE LENGTH: 33 
SEQUENCE TYPE: Nucleic acid 
STRANDEDNESS: Single strand 
TOPOLOGY: Linear 
MOLECULE TYPE: Synthetic DNA 
SEQUENCE 

GGGGTCGACT TATGTTAAAA AATAATGCAT GGC 

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: Suntory Limited 

(B) STREET: 1-40, Dojimahama 2-chome, Kita-ku, Osaka-shi

(C) CITY: Osaka 

(E) COUNTRY: Japan 

(F) POSTAL CODE (ZIP): 530 Japan 

(ii) TITLE OF INVENTION: Process for production of secretory Kex2 
derivatives 

(iii) NUMBER OF SEQUENCES: 45 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: WordPerfect? and PatentIn Release #1.0, Version #1.25 (EPO)

(v) CURRENT APPLICATION DATA: 

APPLICATION NUMBER: EP 97301429.3 
(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: JP 8-73217 

(B) FILING DATE: 04-MAR-1996 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: JP 8-352580 

(B) FILING DATE: 16-DEC-1996 



(2) INFORMATION FOR SEQ ID NO: 1

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2848 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Double strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Saccharomyces cerevisiae

(B) STRAIN: X2180-1B

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1

TGCATAATTC TGTCATAAGC CTGTTCTTTT TCCTGGCTTA AACATCCCGT TTTGTAAAAG 60 
AGAAATCTAT TCCACATATT TCATTCATTC GGCTACCATA CTAAGGATAA ACTAATCCCG 120 
TTGTTTTTTG GCCTCGTCAC ATAATTATAA ACTACTAACC CATTATCAG ATG AAA GTG 178 

Met Lys Val 
1 

AGG AAA TAT ATT ACT TTA TGC TTT TGG TGG GCC TTT TCA ACA TCC GCT 226 
Arg Lys Tyr Ile Thr Leu Cys Phe Trp Trp Ala Phe Ser Thr Ser Ala

5 10 15 

CTT GTA TCA TCA CAA CAA ATT CCA TTG AAG GAC CAT ACG TCA CGA CAG 274 
Leu Val Ser Ser Gln Gln Ile Pro Leu Lys Asp His Thr Ser Arg Gln
20 25 30 35 

TAT TTT GCT GTA GAA AGC AAT GAA ACA TTA TCC CGC TTG GAG GAA ATG 322 
Tyr Phe Ala Val Glu Ser Asn Glu Thr Leu Ser Arg Leu Glu Glu Met 
40 45 50

CAT CCA AAT TGG AAA TAT GAA CAT GAT GTT CGA GGG CTA CCA AAC CAT 370 
His Pro Asn Trp Lys Tyr Glu His Asp Val Arg Gly Leu Pro Asn His 
55 60 65 

TAT GTT TTT TCA AAA GAG TTG CTA AAA TTG GGC AAA AGA TCA TCA TTA 418

Tyr Val Phe Ser Lys Glu Leu Leu Lys Leu Gly Lys Arg Ser Ser Leu 

70 75 80 

GAA GAG TTA CAG GGG GAT AAC AAC GAC CAC ATA TTA TCT GTC CAT GAT 466 
Glu Glu Leu Gln Gly Asp Asn Asn Asp His Ile Leu Ser Val His Asp

85 90 95 

TTA TTC CCG CGT AAC GAC CTA TTT AAG AGA CTA CCG GTG CCT GCT CCA 514
Leu Phe Pro Arg Asn Asp Leu Phe Lys Arg Leu Pro Val Pro Ala Pro 
100 105 110 115 

CCA ATG GAC TCA AGC TTG TTA CCG GTA AAA GAA GCT GAG GAT AAA CTC 562 
Pro Met Asp Ser Ser Leu Leu Pro Val Lys Glu Ala Glu Asp Lys Leu 

120 125 130 

AGC ATA AAT GAT CCG CTT TTT GAG AGG CAG TGG CAC TTG GTC AAT CCA 610 
Ser Ile Asn Asp Pro Leu Phe Glu Arg Gln Trp His Leu Val Asn Pro

135 140 145 

AGT TTT CCT GGC AGT GAT ATA AAT GTT CTT GAT CTG TGG TAC AAT AAT 658 
Ser Phe Pro Gly Ser Asp Ile Asn Val Leu Asp Leu Trp Tyr Asn Asn
150 155 160

ATT ACA GGC GCA GGG GTC GTC GCT GCC ATT GTT GAT GAT GGC CTT GAC 706 
Ile Thr Gly Ala Gly Val Val Ala Ala Ile Val Asp Asp Gly Leu Asp
165 170 175

TAC GAA AAT GAA GAC TTG AAG GAT AAT TTT TGC GCT GAA GGT TCT TGG 754
Tyr Glu Asn Glu Asp Leu Lys Asp Asn Phe Cys Ala Glu Gly Ser Trp
180 185 190 195

GAT TTC AAC GAC AAT ACC AAT TTA CCT AAA CCA AGA TTA TCT GAT GAC 802 
Asp Phe Asn Asp Asn Thr Asn Leu Pro Lys Pro Arg Leu Ser Asp Asp 
200 205 210 

TAC CAT GGT ACG AGA TGT GCA GGT GAA ATA GCT GCC AAA AAA GGT AAC 850

Tyr His Gly Thr Arg Cys Ala Gly Glu Ile Ala Ala Lys Lys Gly Asn

215 220 225 

AAT TTT TGC GGT GTC GGG GTA GGT TAC AAC GCT AAA ATC TCA GGC ATA 898 
Asn Phe Cys Gly Val Gly Val Gly Tyr Asn Ala Lys Ile Ser Gly Ile

230 235 240 

AGA ATC TTA TCC GGT GAT ATC ACT ACG GAA GAT GAA GCT GCC TCC TTG 946 
Arg Ile Leu Ser Gly Asp Ile Thr Thr Glu Asp Glu Ala Ala Ser Leu

245 250 255 

ATT TAT GGT CTA GAC GTA AAC GAT ATA TAT TCA TGC TCA TGG GGT CCC 994 
Ile Tyr Gly Leu Asp Val Asn Asp Ile Tyr Ser Cys Ser Trp Gly Pro
260 265 270 275 

GCT GAT GAC GGA AGA CAT TTA CAA GGC CCT AGT GAC CTG GTG AAA AAG 1042 
Ala Asp Asp Gly Arg His Leu Gln Gly Pro Ser Asp Leu Val Lys Lys

280 285 290 

GCT TTA GTA AAA GGT GTT ACT GAG GGA AGA GAT TCC AAA GGA GCG ATT 1090 
Ala Leu Val Lys Gly Val Thr Glu Gly Arg Asp Ser Lys Gly Ala Ile
295 300 305

TAC GTT TTT GCC AGT GGA AAT GGT GGA ACT CGT GGT GAT AAT TGC AAT 1138 
Tyr Val Phe Ala Ser Gly Asn Gly Gly Thr Arg Gly Asp Asn Cys Asn 

310 315 320 

TAC GAC GGC TAT ACT AAT TCC ATA TAT TCT ATT ACT ATT GGG GCT ATT 1186 

Tyr Asp Gly Tyr Thr Asn Ser Ile Tyr Ser Ile Thr Ile Gly Ala Ile

325 330 335 

GAT CAC AAA GAT CTA CAT CCT CCT TAT TCC GAA GGT TGT TCC GCC GTC 1234 
Asp His Lys Asp Leu His Pro Pro Tyr Ser Glu Gly Cys Ser Ala Val 
340 345 350 355

ATG GCA GTC ACG TAT TCT TCA GGT TCA GGC GAA TAT ATT CAT TCG AGT 1282
Met Ala Val Thr Tyr Ser Ser Gly Ser Gly Glu Tyr Ile His Ser Ser
360 365 370 

GAT ATC AAC GGC AGA TGC AGT AAT AGC CAC GGT GGA ACG TCT GCG GCT 1330

Asp Ile Asn Gly Arg Cys Ser Asn Ser His Gly Gly Thr Ser Ala Ala

375 380 385 

GCT CCA TTA GCT GCC GGT GTT TAC ACT TTG TTA CTA GAA GCC AAC CCA 1378 
Ala Pro Leu Ala Ala Gly Val Tyr Thr Leu Leu Leu Glu Ala Asn Pro

390 395 400 

AAC CTA ACT TGG AGA GAC GTA CAG TAT TTA TCA ATC TTG TCT GCG GTA 1426
Asn Leu Thr Trp Arg Asp Val Gln Tyr Leu Ser Ile Leu Ser Ala Val
405 410 415

GGG TTA GAA AAG AAC GCT GAC GGA GAT TGG AGA GAT AGC GCC ATG GGG 1474
Gly Leu Glu Lys Asn Ala Asp Gly Asp Trp Arg Asp Ser Ala Met Gly 
420 425 430 435

AAG AAA TAC TCT CAT CGC TAT GGC TTT GGT AAA ATC GAT GCC CAT AAG 1522 
Lys Lys Tyr Ser His Arg Tyr Gly Phe Gly Lys Ile Asp Ala His Lys
440 445 450 

TTA ATT GAA ATG TCC AAG ACC TGG GAG AAT GTT AAC GCA CAA ACC TGG 1570

Leu Ile Glu Met Ser Lys Thr Trp Glu Asn Val Asn Ala Gln Thr Trp

455 460 465 

TTT TAC CTG CCA ACA TTG TAT GTT TCC CAG TCC ACA AAC TCC ACG GAA 1618 
Phe Tyr Leu Pro Thr Leu Tyr Val Ser Gln Ser Thr Asn Ser Thr Glu

470 475 480 

GAG ACA TTA GAA TCC GTC ATA ACC ATA TCA GAA AAA AGT CTT CAA GAT 1666 
Glu Thr Leu Glu Ser Val Ile Thr Ile Ser Glu Lys Ser Leu Gln Asp
485 490 495

GCT AAC TTC AAG AGA ATT GAG CAC GTC ACG GTA ACT GTA GAT ATT GAT 1714 
Ala Asn Phe Lys Arg Ile Glu His Val Thr Val Thr Val Asp Ile Asp
500 505 510 515 

ACA GAA ATT AGG GGA ACT ACG ACT GTC GAT TTA ATA TCA CCA GCG GGG 1762
Thr Glu Ile Arg Gly Thr Thr Thr Val Asp Leu Ile Ser Pro Ala Gly

520 525 530 

ATA ATT TCA AAC CTT GGC GTT GTA AGA CCA AGA GAT GTT TCA TCA GAG 1810 
Ile Ile Ser Asn Leu Gly Val Val Arg Pro Arg Asp Val Ser Ser Glu

535 540 545 

GGA TTC AAA GAC TGG ACA TTC ATG TCT GTA GCA CAT TGG GGT GAG AAC 1858 
Gly Phe Lys Asp Trp Thr Phe Met Ser Val Ala His Trp Gly Glu Asn 
550 555 560 

GGC GTA GGT GAT TGG AAA ATC AAG GTT AAG ACA ACA GAA AAT GGA CAC 1906

Gly Val Gly Asp Trp Lys Ile Lys Val Lys Thr Thr Glu Asn Gly His

565 570 575 

AGG ATT GAC TTC CAC AGT TGG AGG CTG AAG CTC TTT GGG GAA TCC ATT 1954 
Arg Ile Asp Phe His Ser Trp Arg Leu Lys Leu Phe Gly Glu Ser Ile

580 585 590 595 

GAT TCA TCT AAA ACA GAA ACT TTC GTC TTT GGA AAC GAT AAA GAG GAG 2002 
Asp Ser Ser Lys Thr Glu Thr Phe Val Phe Gly Asn Asp Lys Glu Glu 

600 605 610 

GTT GAA CCA GCT GCT ACA GAA AGT ACC GTA TCA CAA TAT TCT GCC AGT 2050 
Val Glu Pro Ala Ala Thr Glu Ser Thr Val Ser Gln Tyr Ser Ala Ser

615 620 625 

TCA ACT TCT ATT TCC ATC AGC GCT ACT TCT ACA TCT TCT ATC TCA ATT 2098 
Ser Thr Ser Ile Ser Ile Ser Ala Thr Ser Thr Ser Ser Ile Ser Ile

630 635 640 

GGT GTG GAA ACG TCG GCC ATT CCC CAA ACG ACT ACT GCG AGT ACC GAT 2146 
Gly Val Glu Thr Ser Ala Ile Pro Gln Thr Thr Thr Ala Ser Thr Asp
645 650 655

CCT GAT TCT GAT CCA AAC ACT CCT AAA AAA CTT TCC TCT CCT AGG CAA 2194
Pro Asp Ser Asp Pro Asn Thr Pro Lys Lys Leu Ser Ser Pro Arg Gln
660 665 670 675 

GCC ATG CAT TAT TTT TTA ACA ATA TTT TTG ATT GGC GCC ACA TTT TTG 2242 
Ala Met His Tyr Phe Leu Thr Ile Phe Leu Ile Gly Ala Thr Phe Leu

680 685 690 

GTG TTA TAC TTC ATG TTT TTT ATG AAA TCA AGG AGA AGG ATC AGA AGG 2290
Val Leu Tyr Phe Met Phe Phe Met Lys Ser Arg Arg Arg Ile Arg Arg

695 700 705 

TCA AGA GCG GAA ACG TAT GAA TTC GAT ATC ATT GAT ACA GAC TCT GAG 2338
Ser Arg Ala Glu Thr Tyr Glu Phe Asp Ile Ile Asp Thr Asp Ser Glu

710 715 720 

TAC GAT TCT ACT TTG GAC AAT GGA ACT TCC GGA ATT ACT GAG CCC GAA 2386 
Tyr Asp Ser Thr Leu Asp Asn Gly Thr Ser Gly Ile Thr Glu Pro Glu

725 730 735 

GAG GTT GAG GAC TTC GAT TTT GAT TTG TCC GAT GAA GAC CAT CTT GCA 2434 
Glu Val Glu Asp Phe Asp Phe Asp Leu Ser Asp Glu Asp His Leu Ala 
740 745 750 755 

AGT TTG TCT TCA TCA GAA AAC GGT GAT GCT GAA CAT ACA ATT GAT AGT 2482

Ser Leu Ser Ser Ser Glu Asn Gly Asp Ala Glu His Thr Ile Asp Ser

760 765 770 

GTA CTA ACA AAC GAA AAT CCA TTT AGT GAC CCT ATA AAG CAA AAG TTC 2530 
Val Leu Thr Asn Glu Asn Pro Phe Ser Asp Pro Ile Lys Gln Lys Phe

775 780 785 

CCA AAT GAC GCC AAC GCA GAA TCT GCT TCC AAT AAA TTA CAA GAA TTA 2578 
Pro Asn Asp Ala Asn Ala Glu Ser Ala Ser Asn Lys Leu Gln Glu Leu
790 795 800

CAG CCT GAT GTT CCT CCA TCT TCC GGA CGA TCG 2611 
Gln Pro Asp Val Pro Pro Ser Ser Gly Arg Ser
805 810 814 

TGATTCGATA TGTACAGAAA GCTTCAAATT ACAAAATAGC ATTTTTTTCT TATAGATTAT 2671
AATACTCTCT CATACGTATA CGTATATGTG TATATGATAT ATAAACAAAC ATTAATATCC 2731
TATTCCTTCC GTTTGAAATC CCTATGATGT ACTTTGCATT GTTTGCACCC GCGAATAAAA 2791 
TGAAAACTCC GAACCGATAT ATCAAGCACA TAAAAGGGGA GGGTCCAATT AATGCAT 2848 



(2) INFORMATION FOR SEQ ID NO: 2 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 139 amino acids 

(B) TYPE: Amino acid 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2

Thr Met Ile Thr Asp Ser Leu Ala Val Val Leu Gln Arg Arg Asp Trp

1 5 10 15

Glu Asn Pro Gly Val Thr Gln Leu Asn Arg Leu Ala Ala His Pro Pro

20 25 30 

Phe Ala Ser Trp Arg Asn Ser Glu Glu Ala Arg Thr Asp Arg Pro Ser 

35 40 45 

Gln Gln Leu Arg Ser Leu Asn Gly Glu Trp Arg Phe Ala Trp Phe Pro

50 55 60 

Ala Pro Glu Ala Val Pro Glu Ser Leu Leu Glu Ser Asp Leu Pro Glu 
65 70 75 80 

Ala Asp Thr Val Val Val Pro Ser Asn Trp Gln Met His Gly Tyr Asp
85 90 95 

Ala Pro Ile Tyr Thr Asn Val Thr Tyr Pro Ile Thr Val Asn Pro Pro

100 105 110 

Phe Val Pro Thr Glu Asn Pro Thr Gly Ser Tyr Ser Leu Thr Phe Asn 
115 120 125 

Val Asp Glu Ser Trp Leu Gln Glu Gly Gln Thr

130 135 

(2) INFORMATION FOR SEQ ID NO: 3 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 84 amino acids 

(B) TYPE: Amino acid 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3



Ser Val Ser Glu Ile Gln Leu Met His Asn Leu Gly Lys His Leu Asn

1 5 10 15

Ser Met Glu Arg Val Glu Trp Leu Arg Lys Lys Leu Gln Asp Val His
20 25 30

Asn Phe Val Ala Leu Gly Ala Pro Leu Ala Pro Arg Asp Ala Gly Ser 

35 40 45 

Gln Arg Pro Arg Lys Lys Glu Asp Asn Val Leu Val Glu Ser His Glu
50 55 60

Lys Ser Leu Gly Glu Ala Asp Lys Ala Asp Val Asn Val Leu Thr Lys 
65 70 75 80 

Ala Lys Ser Gln
84

(2) INFORMATION FOR SEQ ID NO: 4

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 
(B) TYPE: Amino acid

(D) TOPOLOGY: Linear 
(ii) MOLECULE TYPE: Peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4

Gly Gly Ser Ser Arg Val Ile Leu Gln Ala Cys Leu Ile Asn
1 5 10 14

(2) INFORMATION FOR SEQ ID NO: 5 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 
(D) TOPOLOGY: Linear

(ii) MOLECULE TYPE: Synthetic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5

AATTCATGAA ATCTGTTAAA AAGCGTTCTG TTTCTGAAAT 40 
(2) INFORMATION FOR SEQ ID NO: 6 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6 

TCAGCTGATG CATAACCTGG GCAAACACCT GAATAGCATG G 41 

(2) INFORMATION FOR SEQ ID NO: 7

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand

(D) TOPOLOGY: Linear

(ii) MOLECULE TYPE: Synthetic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7 

AACGCGTCGA GTGGCTGCGT AAGAAACTGC AGGACGTCCA C

(2) INFORMATION FOR SEQ ID NO: 8 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 base pairs 
(B) TYPE: Nucleic acid

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8 
AACTTCGTTG CGCTGGGTGC ACCGCTGGCT CCACGTGATG C 
(2) INFORMATION FOR SEQ ID NO: 9 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9 
AGGATCCCAA CGTCCGCGTA AGAAAGAAGA TAACGTACT



EP 0 794 254 A2



(2) INFORMATION FOR SEQ ID NO: 10 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 

(B) TYPE: Nucleic acid

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10 
GGTTGAATCT CATGAGAAAT CCCTGGGCGA AGCTGACAAA 
(2) INFORMATION FOR SEQ ID NO: 11 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11 

GCCGATGTTA ACGTGCTGAC CAAAGCGAAA AGCCAGTAAG 

(2) INFORMATION FOR SEQ ID NO: 12 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: Nucleic acid

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12 
TCGACTTACT GGCTTTTCGC TTTGGTCAGC ACG 

(2) INFORMATION FOR SEQ ID NO: 13

(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 40 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13 



TTAACATCGG CTTTGTCAGC TTCGCCCAGG GATTTCTCAT 40 

(2) INFORMATION FOR SEQ ID NO: 14 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14 

GAGATTCAAC CAGTACGTTA TCTTCTTTCT TACGCGGACG 40 



(2) INFORMATION FOR SEQ ID NO: 15 

(i) SEQUENCE CHARACTERISTICS:



(A) LENGTH: 41 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15 



TTGGGATCCT GCATCACGTG GAGCCAGCGG TGCACCCAGC G 41

(2) INFORMATION FOR SEQ ID NO: 16

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16

CAACGAAGTT GTGGACGTCC TGCAGTTTCT TACGCAGCCA C 41 
(2) INFORMATION FOR SEQ ID NO: 17 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17 

TCGACGCGTT CCATGCTATT CAGGTGTTTG CCCAGGTTAT G 41 
(2) INFORMATION FOR SEQ ID NO: 18 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand

(D) TOPOLOGY: Linear 
(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18 

CATCAGCTGA ATTTCAGAAA CAGAACGCTT TTTAACAGAT TTCATG 46 

(2) INFORMATION FOR SEQ ID NO: 19

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19 

TCGAGGTCGA CGGTACCGAG CTCG 

(2) INFORMATION FOR SEQ ID NO: 20 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20 
AATTCGTGCT CGGTACCGTC GACC 

(2) INFORMATION FOR SEQ ID NO: 21 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21 
AATTCGAGCT CGGTACCGTC GACC 

(2) INFORMATION FOR SEQ ID NO: 22

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22 

TCGAGGTCGA CGGTACCGAG CTCG 24 
(2) INFORMATION FOR SEQ ID NO: 23 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23 

TCGAGAAAGA AGAAGGCGTA AGCTTGGAAA AACGAT 36 
(2) INFORMATION FOR SEQ ID NO: 24 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24 

CGTTTTTCCA AGCTTACGCC TTCTTCTTTC 30 

(2) INFORMATION FOR SEQ ID NO: 25

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25 
GTTAAAAAGC GATCGGTTTC TGAAATTCAG CTG 
(2) INFORMATION FOR SEQ ID NO: 26 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26 
GCACCGGTAC CTTAGAAGTT GTGGACGTCC TGCA 
(2) INFORMATION FOR SEQ ID NO: 27 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27 
GCTAAGGAAG AATTCATGGA GAAAAAAATC 
(2) INFORMATION FOR SEQ ID NO: 28 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28 

CTGCCTTAAA ACTCGAGCGC CCCG 24 
(2) INFORMATION FOR SEQ ID NO: 29 

(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 45 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29 
AAACTGCAGG ACGTCCACAA CTTCTAAGCG CTGGGTGCAC CGCGT 45 
(2) INFORMATION FOR SEQ ID NO: 30 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30 

CATTAAAGCT TTGCGATGAT AAGC 24 
(2) INFORMATION FOR SEQ ID NO: 31 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31 

CGCACCGATC GCCCTTCCCA ACAGTT 

(2) INFORMATION FOR SEQ ID NO: 32 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32 
TTTCCCGGGC CTCCGTGGGA ACAAACGGCG GATTG 
(2) INFORMATION FOR SEQ ID NO: 33 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33 
TTTCCCGGGA GGCCTTCTGT TAAAAAGCGG TCTGTTTCTG AA 
(2) INFORMATION FOR SEQ ID NO: 34 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34 

CACCATCATC ACCCTGGA 

(2) INFORMATION FOR SEQ ID NO: 35 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35 
TCCAGGGTGA TGATGGTG 
(2) INFORMATION FOR SEQ ID NO: 36 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36 
TAAGAATTCA TGAAAGTGAG GAAATATATT ACTTTAT 
(2) INFORMATION FOR SEQ ID NO: 37 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37 

TAAGTCGACT TAAGGATCGG TACTCGCAGT AGTCG 35 
(2) INFORMATION FOR SEQ ID NO: 38 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38 

TAAGTCGACT TAATAATGCA TGGCTTGCCT AGGAG 35 
(2) INFORMATION FOR SEQ ID NO: 39 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39 

TAAGTCGACT TAGGCGCCAA TCAAAAATAT TGTTAAAAA 39 
(2) INFORMATION FOR SEQ ID NO: 40 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40 
TAAGTCGACT TACATAAAAA ACATGAAGTA TAACACCAA 
(2) INFORMATION FOR SEQ ID NO: 41 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41 
GCGGCCGCTT AAACATCCCG TTTTGTAAAA AGAGA 
(2) INFORMATION FOR SEQ ID NO: 42 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42 
GGGGTCGACT TAAGAAGTTG AACTGGCAGA A 
(2) INFORMATION FOR SEQ ID NO: 43 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43 

GGGGTCGACT TAAGAAGATG TAGAAGTAGC G 31 
(2) INFORMATION FOR SEQ ID NO: 44 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44 

GGGGTCGACT TAAATGGCCG ACGTTTCCAC 

(2) INFORMATION FOR SEQ ID NO: 45 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single strand 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Synthetic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45 

GGGGTCGACT TATGTTAAAA AATAATGCAT GGC 33 
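
The declared lengths in the listing can be cross-checked against the printed sequences. The following is a minimal sketch and not part of the specification: the helper names (`normalize`, `reverse_complement`, `length_mismatches`) are hypothetical, and only a few entries are copied in for illustration. It also checks that SEQ ID NO: 35, as printed, is the reverse complement of SEQ ID NO: 34.

```python
# Illustrative consistency check; helper names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(printed):
    """Remove the spaces that group a printed sequence into blocks of ten."""
    return "".join(printed.split()).upper()


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# SEQ ID NO -> (declared length in bases, sequence as printed in the listing)
LISTING = {
    34: (18, "CACCATCATC ACCCTGGA"),
    35: (18, "TCCAGGGTGA TGATGGTG"),
    36: (37, "TAAGAATTCA TGAAAGTGAG GAAATATATT ACTTTAT"),
    37: (35, "TAAGTCGACT TAAGGATCGG TACTCGCAGT AGTCG"),
}


def length_mismatches(listing):
    """List the SEQ ID NOs whose printed length differs from the declared one."""
    return [
        seq_id
        for seq_id, (declared, printed) in listing.items()
        if len(normalize(printed)) != declared
    ]


if __name__ == "__main__":
    print(length_mismatches(LISTING))
    seq34 = normalize(LISTING[34][1])
    seq35 = normalize(LISTING[35][1])
    print(reverse_complement(seq34) == seq35)
```

Any further entry can be checked the same way by adding its declared length and printed sequence to `LISTING`.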
Claims 

1. A protein with Kex2 protease enzyme activity obtainable by transforming a methylotrophic yeast with an expression 
vector containing DNA coding for a natural amino acid sequence whose N-terminus is the Met at amino acid 1 and 
whose C-terminus is one of the amino acids between amino acids 618 (inclusive) and 698 (inclusive) of SEQ ID 
NO. 1, or an amino acid sequence modified by substitution, deletion or addition of one or more amino acids in said 
natural amino acid sequence, and then culturing the resulting transformant and recovering the protein from said 
culture. 

2. A protein according to claim 1, wherein the C-terminal amino acid of said natural amino acid sequence is any 
amino acid between amino acids 630 (inclusive) and 688 (inclusive) of SEQ ID NO. 1. 

3. A protein according to claim 1, wherein the C-terminal amino acid of said natural amino acid sequence is any 
amino acid between amino acids 630 (inclusive) and 682 (inclusive) of SEQ ID NO. 1. 

4. A protein according to claim 1, wherein the C-terminal amino acid of said natural amino acid sequence is any 
amino acid between amino acids 630 (inclusive) and 679 (inclusive) of SEQ ID NO. 1. 

5. DNA coding for a protein according to any one of claims 1 to 4. 

6. An expression vector comprising DNA according to claim 5. 

7. A transformant obtained by transforming a methylotrophic yeast with an expression vector according to claim 6. 

8. A transformant according to claim 7, wherein said methylotrophic yeast is yeast belonging to Pichia, Hansenula 
or Candida. 

9. A transformant according to claim 8, wherein said yeast is Pichia pastoris, Hansenula polymorpha or Candida 
boidinii. 

10. A process for producing a protein according to any one of claims 1 to 4, comprising the steps of culturing a trans- 
formant according to any one of claims 7 to 9 and recovering the peptide from said culture. 

11. A process according to claim 10, wherein said peptide is recovered from the culture supernatant by anion exchange 
chromatography and hydrophobic chromatography. 

12. A process for excision of a desired peptide from a chimeric protein, comprising the steps of allowing a protein 
according to any one of claims 1 to 4 to act on a chimeric protein containing the desired peptide and the sequence 
Arg-Arg, Lys-Arg or Pro-Arg positioned adjacent to the N-terminus of the desired peptide, and obtaining the desired 
peptide. 

13. The use of a protein according to any of claims 1 to 4 for excision of a desired peptide from a chimeric protein 
containing the desired peptide and the sequence Arg-Arg, Lys-Arg or Pro-Arg positioned adjacent to the N-terminus 
of the desired peptide, and obtaining the desired peptide. 
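
The excision recited in claims 12 and 13 can be illustrated with a short sketch. It is not part of the claims and makes several assumptions: proteins are written in one-letter amino acid codes (so Arg-Arg, Lys-Arg and Pro-Arg become `RR`, `KR` and `PR`), the bond on the C-terminal side of the recognition sequence is cut, every occurrence of a recognition sequence is cut, and the chimeric sequence used is arbitrary. The function names are hypothetical, and real cleavage efficiency depends on factors this model ignores.

```python
from typing import List

# Arg-Arg, Lys-Arg and Pro-Arg in one-letter code (an assumption of this sketch).
RECOGNITION_SITES = ("RR", "KR", "PR")


def cleavage_positions(protein: str) -> List[int]:
    """Return the indices at which the chain is cut.

    A cut is placed immediately after each recognition pair, i.e. on the
    C-terminal side of the second residue of the pair.
    """
    return [
        i + 2
        for i in range(len(protein) - 1)
        if protein[i:i + 2] in RECOGNITION_SITES
    ]


def excise(protein: str) -> List[str]:
    """Split the protein at every cleavage position, dropping empty pieces."""
    cuts = [0] + cleavage_positions(protein) + [len(protein)]
    return [protein[a:b] for a, b in zip(cuts, cuts[1:]) if a < b]


if __name__ == "__main__":
    # Arbitrary example: a leader ending in Lys-Arg followed by the desired peptide.
    chimeric = "MKTAYLEKRSVSEIQLMHN"
    print(excise(chimeric))  # ['MKTAYLEKR', 'SVSEIQLMHN']
```

The last fragment is the desired peptide only when the recognition sequence does not also occur inside it, which is why the placement of the site adjacent to the N-terminus of the desired peptide matters.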
Fig. 5 
Fig. 7 
Fig. 8 
Fig. 12 
Fig. 13 
Fig. 15 
Fig. 17 
Fig. 19 
Fig. 21 
Fig. 22 
Fig. 23 
Fig. 25 
Fig. 26 
Fig. 27 